I, FRANCIS BARANY, Ph.D., hereby declare: 
1. I am a co-inventor of the above-identified application. 



2. I am currently a Professor in the Department of Microbiology and 
Immunology, as well as the program of Biochemistry and Structural Biology at Weill 
Medical College of Cornell University in New York, New York. I was also concurrently 
Adjunct Professor at The Rockefeller University, New York, New York, as well as Director 
of Mutation Research at the Strang Cancer Prevention Center at Weill Medical College of 
Cornell. 

3. I received a B.A. in Chemistry from the University of Illinois at 
Chicago, Chicago, Illinois in 1976 and a Ph.D. in Microbiology from The Rockefeller 
University, New York, New York in 1981. I conducted postdoctoral work from 1981 to 1982 
in microbiology at The Rockefeller University and from 1982 to 1985 in molecular biology at 
Johns Hopkins University School of Medicine in Baltimore, Maryland. 

4. My laboratory collaborates extensively with both academic and 
industrial researchers, including Memorial Sloan Kettering, Rockefeller University, UMDNJ, 
Princeton University, Weizmann Institute, Louisiana State University, Centers for Disease 
Control, and others. Our specialty is using thermostable enzymes and advanced DNA 
technology to help detect and characterize molecular changes in both cancers and infectious 
diseases. My laboratory has developed sensitive and specific assays using thermostable 
ligase combined with polymerase chain reaction ("PCR") for these applications. My 
laboratory is best known for developing the ligase chain reaction ("LCR"), ligase detection 
reaction ("LDR") and a programmable DNA chip (Universal Array). Other advances include 
the EndoV/Ligase mutation scanning assay and harmonized p53 mutation detection. I have 
served as the chair of the following National Institutes of Health Review Panels: Partnerships 
for Point of Care (POC) Diagnostic Technologies, Partnerships for Biodefense Food- and 
Water-borne Diseases, and Innovative Technologies for the Molecular Analysis of Cancer. I 
am inventor on over two dozen issued patents, comprising more than 10% of issued patents at 
Weill Cornell Medical College, and have authored over 100 peer-reviewed articles. In 2004, I 
was honored as Medical Diagnostics Research Leader, Scientific American 50. A copy of my 
Curriculum Vitae, listing these patents and publications, is attached as Exhibit 1 hereto. 

5. Prior to the filing date of my above referenced patent application, it 
was well appreciated in the art that the ability to accurately identify low abundance nucleic 
acid sequence variations, including single nucleotide polymorphisms, insertions, deletions, or 
translocations at multiple adjacent, nearby, and distant genomic loci would have profound 
implications for the identification of genetic disorders, the diagnosis and treatment of cancer, 
and the detection of infectious diseases. Cancer, for example, can arise from the 
accumulation of mutations in genes controlling cell cycle, apoptosis, and genome integrity. 
Oncogenes may be activated by point mutations, translocations, or gene amplification, while 
tumor suppressor genes may be inactivated by point mutations, frameshift mutations, and 
deletions. These mutations may be inherited or somatic, arising from exposure to 
environmental factors or from malfunctions in DNA replication and repair machinery. Since 
the capacity to detect these cancer related mutations would significantly enhance cancer 
detection and diagnosis, and identify the most effective and targeted cancer treatment 
protocols, there was a well recognized need in the art for an assay that could achieve early 
and accurate detection of these cancer related mutations. I am presenting this declaration to 
demonstrate how the efforts of others in the art to develop a hybridization array-based 
detection assay to meet this need have failed and continue to fail. I am also presenting this 
declaration to show how my array with capture oligonucleotides having greater than sixteen 
nucleotides and sequences selected to hybridize with complementary oligonucleotide target 
sequences under uniform hybridization conditions across the array of oligonucleotides with 
minimal cross reactivity, where each capture oligonucleotide of the array differs in sequence 
from other adjacent capture oligonucleotides, when aligned to each other, by at least 25% of 
the nucleotides (hereafter identified as "My Array Design"), has overcome these failures and 
has successfully resolved this unmet and long-felt need. 

6. The development of an assay suitable for the detection of cancer 
related mutations was fraught with challenges. The first challenge was to identify an 
approach that could detect very low abundant target mutations within a patient sample 
containing a plurality of closely related non-target sequences (i.e., normal, non-mutant 
sequence). In primary tumors, for example, normal stromal cell contamination can be as high 
as 70% of total cells. Therefore, a mutation present in only one of the two chromosomes of a 
tumor cell may represent as little as 15% of the DNA sequence in a sample. In addition, early 
detection of such mutations requires the ability to detect as few as one mutant copy of a 
nucleic acid sequence in the presence of over 100 non-mutant copies of the nucleic acid 
sequence. Accordingly, the detection assay had to be highly sensitive. A second challenge 
was to develop a highly specific assay having the capacity to reproducibly discriminate and 
detect a plurality of often closely spaced mutations in multiple genes without generating 
false-positive or false-negative results. Finally, it was highly desirable to employ an assay 
that could achieve this highly sensitive and specific mutation detection in non-invasively 
collected patient samples to help reduce overall cost and, more importantly, alleviate patient 
discomfort. 
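
The arithmetic behind the 15% figure in paragraph 6 can be sketched as follows. This is an illustrative sketch only: the function name and its parameters are hypothetical, and it assumes that every cell contributes two copies of the locus and that the mutation sits on one of the two chromosomes of each tumor cell.

```python
def mutant_allele_fraction(stromal_fraction,
                           mutant_copies_per_tumor_cell=1,
                           copies_per_cell=2):
    """Fraction of all copies of a locus in the sample that carry the mutation.

    Assumption (not from the declaration): stromal and tumor cells each
    contribute `copies_per_cell` copies of the locus.
    """
    tumor_fraction = 1.0 - stromal_fraction
    return tumor_fraction * mutant_copies_per_tumor_cell / copies_per_cell


# 70% stromal contamination leaves 30% tumor cells; half of their copies are mutant.
fraction = mutant_allele_fraction(0.70)
print(f"{fraction:.0%}")
```

With 70% stromal cells, the tumor cells make up 30% of the sample, and half of their copies carry the mutation, which gives 15%.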

7. Although the advent of DNA array technology, based on direct 
hybridization of target sequence to the array, resulted in a paradigm shift in identifying 
expression changes and known SNPs on a genomic scale, it failed, and, as discussed below, 
continues to fail to meet the above noted challenges in detecting mutations. 

8. Typical DNA hybridization arrays are designed to simultaneously 
discriminate and detect multiple target sequences differing in sequence by only one or a few 
nucleotides. Target sequence discrimination using a hybridization array depends on the 
highly specific binding affinity of the immobilized capture oligonucleotides to their 
complementary labeled target sequences. The hybridized labeled target sequences are 
subsequently detected and identified by their location of hybridization on the array surface. 



[Figure 1 labels: capture oligonucleotide S1 is complementary to the normal target sequence; 
capture oligonucleotides S2 and S3 each differ by a single nucleotide base; labeled (Cy5) 
target sequence hybridized to complementary S1 capture oligonucleotide sequence; capture 
oligonucleotides immobilized to array surface.] 

Figure 1: A typical direct hybridization array having capture oligonucleotides designed to simultaneously 
discriminate and detect multiple nucleotide variations at multiple adjacent and nearby genetic loci. 



9. Figure 1 depicts a typical hybridization array having capture 
oligonucleotide probes immobilized on the array surface where the capture probes are 
designed to discriminate nucleotide variations (i.e., allelic variations) at multiple adjacent and 
nearby loci (e.g., S1, S2, and S3). The capture oligonucleotides responsible for allelic 
discrimination at a particular locus (e.g., S1, S2, or S3 in Figure 1) differ from each other by 
only a single nucleotide base substitution, insertion, or deletion, and, therefore, have very 
similar nucleotide sequences as shown in Figure 2 below. Consequently, these capture 
probes also have very similar melting temperatures (Tm). 

Figure 2A 

Target: 5'-CATTAAAGAAAATATCATCTTTGGTGTTTCCTATGATGA 

Probes (93% identical): 

3'-TTTATAGTAGAAACC (SEQ. ID NO: 9G) 
3'-TTTATAATAGAAACC (SEQ. ID NO: 9A) 
3'-TTTATATTAGAAACC (SEQ. ID NO: 9T) 
3'-TTTATACTAGAAACC (SEQ. ID NO: 9C) 

Figure 2B 

Target: 5'-CATTAAAGAAAATATCATCTTTGGTGTTTCCTATGATGA 

Probes (93% identical): 

3'-TTATAGTAGAAACCA (SEQ. ID NO: 10T) 
3'-TTATAGCAGAAACCA (SEQ. ID NO: 10C) 
3'-TTATAGGAGAAACCA (SEQ. ID NO: 10G) 
3'-TTATAGAAGAAACCA (SEQ. ID NO: 10A) 

Probes of 2A and 2B: 87% identical. 

Figure 2: Capture oligonucleotide probes of a typical hybridization array. Figure 2A shows that capture 
oligonucleotide probes designed to detect target sequence variations at one nucleotide position share 93% 
sequence identity while probes designed to detect target sequence variations at nearby genetic loci (i.e., probes 
of 2A and 2B) share 87% sequence similarity. 
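
The 93% figure quoted in the caption of Figure 2 can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, the two probe sequences are the SEQ. ID NO: 9A and 9T probes as read from Figure 2, and the comparison is a simple position-by-position count with no gaps, not a full alignment program.

```python
def percent_identity(a, b):
    """Ungapped, position-by-position identity of two equal-length sequences (percent)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


probe_9a = "TTTATAATAGAAACC"  # SEQ. ID NO: 9A as read from Figure 2
probe_9t = "TTTATATTAGAAACC"  # SEQ. ID NO: 9T as read from Figure 2

# The probes differ at one of 15 positions, so 14 of 15 positions match.
print(round(percent_identity(probe_9a, probe_9t)))
```

By the same arithmetic, 13 matching positions out of 15 correspond to the 87% figure quoted for probes of 2A and 2B.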



10. In addition to capture probes designed to detect sequence variations at 
a single locus or nearby loci, typical hybridization arrays also contain capture probes that are 
designed to detect distal mutations. Because capture probes are target-sequence specific, the 
nucleotide sequences of probes detecting distal target sequences will differ significantly in 
both sequence and melting temperature from other capture probes on the array designed to 
detect other distant mutations. 
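
How base composition drives melting temperature can be illustrated with the Wallace rule, a common rule of thumb for short oligonucleotides (Tm of roughly 2 °C per A or T plus 4 °C per G or C). This rule is swapped in here purely for illustration and is not the method used in the declaration. The function name is hypothetical, the first two probes are SEQ. ID NO: 9A and 9T as read from Figure 2, and the third sequence is an invented GC-rich 15-mer standing in for a probe to a distant locus.

```python
def wallace_tm(seq):
    """Rough Tm estimate in degrees C using the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


probes = {
    "SEQ. ID NO: 9A (Figure 2)": "TTTATAATAGAAACC",
    "SEQ. ID NO: 9T (Figure 2)": "TTTATATTAGAAACC",
    "hypothetical distal probe": "GCCGTCGGACCTGCA",  # invented for illustration
}

for name, seq in probes.items():
    print(name, wallace_tm(seq))
```

Under this rough estimate the two allelic probes come out with the same Tm, while the invented GC-rich probe comes out well above them, which is the kind of spread that makes one set of hybridization conditions hard to apply across an array.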



11. Figure 3 below illustrates some of the problematic results that emanate 
from the typical direct hybridization array designed to detect target sequences having 
overlapping sequence homology (e.g., target sequences that have only single nucleotide 
differences). It is possible for a single target to bind to multiple oligonucleotide probes with 
different, yet similar sequences due to mismatched cross hybridization. Cross hybridization 
will result in the generation of false-positive signals. 

[Figure 3 labels: labeled target hybridized to S1 probe with complete complementarity; 
False Positive: signal resulting from cross-hybridization between non-mutant target 
sequence and mutant capture oligonucleotide; False Negative: loss of signal resulting from 
low level of mutant sequence present in sample; capture oligonucleotide probes 
immobilized on array surface.] 

Figure 3. Problematic results of a typical hybridization array 



12. Another major drawback of hybridization arrays is their propensity to 
generate false-negative signals. As noted above, significant variability in nucleotide 
sequence and melting temperature exists between capture probes that are designed to detect 
distant mutations. Since optimal hybridization conditions for a target and its complementary 
capture probe are sequence specific, employing uniform, highly stringent hybridization 
conditions across the array that are suitable for all capture probe-target sequence pairs is 
difficult, if not impossible. The application of hybridization conditions that are overly 
stringent for some sequences will prevent target-probe hybridization and lead to the loss of 
signal (i.e., false-negative signal). Since the melting temperature between target and probe 
sequence varies across the array, stringent hybridization conditions will result in weak or 
missing signal from low abundant mutations. If non-stringent conditions are used to detect 
low level mutations, this will significantly increase the likelihood of cross-hybridization 
leading to false-positives. 

13. In addition to the possibility that a single target will bind to multiple 
oligonucleotide probes with different, yet similar sequences due to mismatched cross 
hybridization, there are many other competitive processes that influence signal intensity 
values generated during array hybridization. Besides the desired target binding to probe 
(Figure 4A), there is the undesired probe self binding (Figure 4B), folding of target to reduce 
binding to probe (Figure 4C), and dimerization of adjacent probes (Figure 4D) as illustrated 
below. 




Figure 4: Depiction of four competitive processes on signal intensity values. Each panel shows a labeled (*) 
target and an immobilized probe on a microarray. (A) hybridization of a target to a probe; (B) probe 
self-folding; (C) folding of the target; and (D) dimerization of adjacent probes (reproduced from Pozhitkov et al., 
Nucleic Acids Research 34(9):e66 (2006)). 



14. While a great deal of effort has been invested in developing design 
strategies that generate capture probes having minimal cross-reactivity to non-complementary 
target sequence, minimal cross-reactivity to other probe sequences, and do not undergo self- 
folding, none of these efforts have proven successful. These design strategies, typically 
based on the thermodynamic properties of the capture oligonucleotides (e.g., guanine-cytosine 
content, secondary structure, melting temperature, etc.), attempted to predict 
oligonucleotide duplex formation. Using these strategies, capture probes were designed so 
that a single mismatch base pair would, theoretically, significantly lower the binding affinity 
of the mismatched duplex compared to the corresponding perfectly matched duplex at a given 
temperature. These differential binding affinities of a target sequence to a mismatch or 
perfect match probe provide the basis of sequence discrimination, allowing for the 
identification of target sequence because it is bound to a perfect match but not a mismatched 
probe at a specified temperature. 



15. Figure 5 below depicts the melting curve profiles, i.e., the 
temperature-dependent dissociation of capture probe bound to its target sequence, for a set of 
mismatch and perfect match probes. Ideally, the melting profiles of the mismatch probe and 
perfect match probe would be tight to facilitate unambiguous differential signal detection. 
However, as indicated by the red arrow in the top panel, some mismatched probes have 
melting curves that overlap exactly with the perfect matched probes, making it impossible to 
distinguish target binding to the mismatched probe from the perfect match probes. The first 
derivatives of these melting curves, shown in the bottom panel of Figure 5, more clearly 
illustrate the overlap in melting temperatures between the mismatch and perfect match 
probes (red arrow). In addition, they show that some mismatched probes produce signal over a 
broad temperature range, generating signal that overlaps with both mismatch and perfect 
match probes (see bottom panel of Figure 5). These mismatch probes also cannot be 
accurately distinguished from perfect match probes. 




[Figure 5 annotations: even in solution, some mismatched probes (MM) have melting curves 
like perfect matched (PM) probes; some mismatched probes (MM) produce signal over a broad 
temperature range that overlaps with perfect matched (PM) probes. Horizontal axis: 
Temperature (°C).] 



Figure 5: Melting profiles of mismatch (MM) and perfect match (PM) oligonucleotide probes in solution 
(reproduced from www.genewave.com/images/manager/hyblive schemal .Jpg ). The top panel is a plot 
monitoring the decrease in signal intensity that occurs as the DNA duplex melts with increasing temperature. 
The bottom panel is a plot of the negative first derivative of the change in fluorescence (-dF/dT, the rate of 
change of fluorescence) versus temperature. The distinct peaks in this plot correspond to the melting 
temperature of each DNA duplex. 
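
The relationship between a melting curve and its negative first derivative can be sketched numerically. The example below is a minimal illustration under stated assumptions: the curves are synthetic logistic curves rather than measured data, the melting temperatures (55 °C and 48 °C) and the width parameter are invented, and all function names are hypothetical.

```python
import math


def melt_curve(temps, tm, width=2.0):
    """Synthetic melting curve: duplex signal falling from 1 to 0 around `tm`."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


def negative_derivative(temps, signal):
    """Central-difference estimate of -dF/dT at each interior temperature."""
    points = []
    for i in range(1, len(temps) - 1):
        d_signal = signal[i + 1] - signal[i - 1]
        d_temp = temps[i + 1] - temps[i - 1]
        points.append((temps[i], -d_signal / d_temp))
    return points


def estimate_tm(temps, signal):
    """Temperature at which -dF/dT peaks, taken as the melting temperature."""
    return max(negative_derivative(temps, signal), key=lambda p: p[1])[0]


temps = [30 + 0.5 * i for i in range(81)]       # 30 to 70 degrees C in 0.5 degree steps
perfect_match = melt_curve(temps, tm=55.0)      # invented PM duplex
mismatch = melt_curve(temps, tm=48.0)           # invented MM duplex

print(estimate_tm(temps, perfect_match), estimate_tm(temps, mismatch))
```

In this idealized case the two peaks are well separated. When a mismatch duplex has a melting temperature close to that of the perfect match, or a much broader transition, the two derivative peaks merge and the duplexes can no longer be told apart by temperature.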

16. Naiser et al., "Impact of Point-Mutations on the Hybridization Affinity 
of Surface-Bound DNA/DNA and RNA/DNA Oligonucleotide-Duplexes: Comparison of 
Single Base Mismatches and Base Bulges," BMC Biotech. 8:48 (2008) ("Naiser") (attached 
hereto as Exhibit 2) describes a comprehensive analysis of how single nucleotide variations 
(referred to as "point defects") affect the hybridization of fluorescently labeled 
oligonucleotide targets to surface-bound oligonucleotide probes (Naiser at p. 49, col. 2, para. 
2). Naiser generated a set of point mutated probes derived from a common probe sequence 
motif that was complementary to a region of a target sequence (id. at figure legend of Figure 
1). Exemplary probe sequences representing single nucleotide substitutions, insertions, and 
deletions at the first two bases (i.e., first two defect positions) of the probe sequence are 
depicted in Figure 1 of Naiser (reproduced below as Figure 6). 

[Figure 6 image: point-mutated variants of a common 16-mer probe sequence motif; the 
individual probe sequences are not legible in this copy.] 

Figure 6: Figure 1 of Naiser et al., BMC Biotech. 8:48 (2008) showing that the comprehensive set of 
point-mutated probes is derived from a common probe sequence motif which is complementary to the target sequence. 
Probe sequences are shown for the first two defect positions only. 

17. The probe sequences were arranged on a microarray surface as a 
compact feature block and a sample containing a single target nucleotide sequence was 
contacted with the array surface to facilitate target-probe hybridization (id. at p. 49, col. 2, 
para. 2, and Figure 1). Hybridization signals resulting from target sequence hybridization to 
individual probes in the probe set were plotted against the position of the defect in the probe 
sequence to create a defect profile (id. at figure legend of Figure 1). The defect profile shown 
in Figure 7 (below) provides a direct comparison of the binding affinities for a plurality of 
mismatch oligonucleotide duplexes (i.e., duplexes between target and probe where the probes 
differ from the target sequence by a single base mismatch, insertion, or deletion), and 
demonstrates the considerable variability in binding affinities that exists between mismatched 
probe sequences. 

[Figure 7 plot: hybridization signal versus defect position (0 to 16), with the horizontal 
axis annotated by the bases of the probe sequence motif.] 

Figure 7: Figure 6 of Naiser et al., BMC Biotech. 8:48 (2008) showing the direct comparison of single base 
mismatches, insertions and deletions. The 16 mer probe sequence motif 3'-TTGACTTTCGTTTCTG-5' is 
complementary to the target BEL. Hybridization signals (data processing: raw fluorescence intensities; solution- 
background correction) of single base mismatch probes with substituent bases A (red crosses), C (green circles), 
G (blue stars), T (cyan triangles); running average of mismatch intensities (black line); perfect match probe 
signals (grey symbols); single base insertion probes (solid lines) with insertion bases A (red), C (green), G 
(blue), T (cyan). Hybridization signals of single base deletions (orange dashed line) are comparable to that of 
mismatches at the same position. Increased hybridization signals of certain insertion defects are due to 
positional degeneracy of base bulges. 

18. The defect profiles of Naiser reveal that the dominant parameter 
determining oligonucleotide probe-target affinity - on the microarray surface - is the position 
of the defect (Naiser at p. 50, para. bridging col. 1 and 2 and p. 64, col. 2, para. 1). The grey 
symbols in the defect plot represent the intensity signal generated by perfect match probe 
binding to its complementary sequence. A moving average of the hybridization signal across 
defect positions reveals a trough-like "mean profile" curve (represented by a solid black line 
in the defect plots) that provides a reasonable approximation for the average position 
dependence obtained from a large number of different sequence motifs (id.). For 16mer 
duplexes, for example, a single base mismatch in the center of the duplex typically results in 
25% of the perfect match (PM) hybridization signal while a mismatch near or at the end of 
the duplex results in 50% to 75% of the PM hybridization signal. However, for individual 
sequence motifs, significant sequence-dependent deviations from the simple position 
dependence were also observed. For example, a single base (G) insertion between positions 
6-7 and 10-12 generates a mismatch signal intensity that is approximately 75% of the perfect 
match signal. Unexpectedly, a single base substitution at the central position 8 (T->G) 
generates a mismatch signal intensity that approaches 75% of the perfect match signal. This 
defect plot clearly illustrates the unpredictable influence that a single nucleotide variation can 
have on probe-target binding affinity. 

19. Figure 8 below is a fluorescence micrograph of the microarray feature- 
block illustrating the problem associated with trying to accurately discriminate mismatch and 
perfect match probe binding to target sequence based on hybridization signal intensity. The 
microarray feature-block comprises variations of a 16mer probe sequence motif (i.e., probe 
sequences varying by single base insertions, deletions, and substitutions) subject to 
hybridization to a single nucleic acid target sequence. Each 3x3 sub-array comprises one 
perfect matching probe, three single base mismatch probes, four insertion probes, and one 
single base deletion probe. While the signal intensity generated by the perfect match probe is 
distinguishable from the signal intensity generated by the mismatch probes in the center two 
sub-arrays outlined in red (i.e., the single brightest square within each sub-array represents 
the intensity of the perfect match probe), there is still considerable signal observed at other 
positions. Further, where mismatches are elsewhere, it is even more difficult to distinguish 
signal intensities generated by perfect match and mismatch probes. 




Figure 8: Fluorescence micrograph of a microarray feature-block comprising variations of the 16 mer probe 
sequence motif 3'-TATTACTGGACCTGAC-5'. Microarray hybridization was performed with the 5'-Cy3- 
labeled RNA oligonucleotide target 3'-AACUCGCUAUAAUGACCUGGACUG-5' (target concentration: 1 nM 
in 5x SSPE, pH 7.4, 0.01% Tween-20, T = 30°C). Each 3x3 sub-array comprises one perfect matching probe, 
three single base mismatch probes, four insertion probes and one single base deletion probe. Figure reproduced 
from Naiser et al., "Position Dependent Mismatch Discrimination of DNA Microarrays - Experiments and 
Model," BMC Bioinformatics 9:509 (2008) (attached hereto as Exhibit 3). 

20. Naiser's findings are consistent with those reported in an earlier study 
by Pozhitkov et al., "Tests of rRNA Hybridization to Microarrays Suggest that Hybridization 
Characteristics of Oligonucleotide Probes for Species Discrimination Cannot be Predicted," 
Nucleic Acids Research 34(9):e66 (2006) ("Pozhitkov") (attached hereto as Exhibit 4). 
Pozhitkov assessed the utility of in silico predictions of probe-target duplex stabilities using 
DNA arrays for the detection of rRNA sequences (Pozhitkov at p. 2, col. 1, para. 3). 

21. Pozhitkov's assessment of the effects of mismatches in the probe 
sequence on signal intensity values generated when hybridized to the non-mismatched target 
sequence also revealed that mismatch position, mismatch type, and the type of neighboring 
nucleotides surrounding the defect have significant effects on the normalized signal intensity 
values (id. at p. 7, col. 1, last para.). "Moving the MM base away from the 5' or 3' termini to 
the center of the probe significantly decreases signal intensities (Figure 3). ... However, we 
emphasize that this was an average result, and note that in some individual cases, MM probes 
with central mismatches (positions 9-11) were observed to have signal intensities that were 
equal to or 1.6 times higher than the corresponding perfectly matched probe" (id. at p. 7, col. 
2, para. 2). 

22. The implications of Pozhitkov's and Naiser's findings are that direct 
hybridization methods which attempt to simultaneously discriminate and detect nucleic acid 
sequence variations are inadequate because of the unpredictable cross-hybridization between 
target sequence and mismatch or perfect match probe sequences. Thermodynamic 
parameters are simply not capable of accurately predicting mismatch and perfect match 
oligonucleotide duplex formation. Accordingly, despite tremendous efforts to improve the 
reliability and reproducibility of the technology, the findings of Naiser and Pozhitkov 
described above clearly indicate that a fundamental understanding of the technology is 
lacking and that even current approaches for microarray design are inadequate. Likewise, 
this same problem extends to any assay format where the target sequence is detected and 
distinguished from other sequences by its hybridization to a complementary sequence. 

Summary of U.S. Patent No. 5,510,270 to Fodor et al. ("Fodor") 

23. Fodor relates to a method for synthesizing and screening polymers on a 
solid substrate (Fodor at abstract). The method involves providing a substrate which may 
include linker molecules on its surface (id. at col. 8, lines 46-48). On the substrate or a distal 
end of the linker molecules, a functional group with a protective group is provided (id. at 
lines 58-59). The protective group may be removed upon exposure to radiation, electric 
fields, electric currents, or other activators to expose the functional group (id. at lines 59-62). 
Concurrently or after exposure of a known region of the substrate to light, the surface is 
contacted with a first monomer unit M1 which reacts with the functional group which has 
been exposed by the deprotection step (id. at col. 9, lines 11-14). Thereafter, second regions 
of the surface (which may include the first region) are exposed to light and contacted with a 
second monomer M2 (which may or may not be the same as M1) having a protective group 
(id. at lines 26-29). These steps are repeated until the substrate includes desired polymers of 
desired lengths (id. at lines 37-38). Monomers may include amino acids, nucleotides, 
pentoses, and hexoses (id. at col. 6, lines 14-18). 

24. Fodor does not teach arrays of oligonucleotides on a solid support 
where each oligonucleotide of the array differs in sequence from other adjacent oligonucleotides 
when aligned to each other by at least 25% of the nucleotides. 

Summary of U.S. Patent No. 5,474,796 to Brennan et al. ("Brennan") 

25. Brennan relates to an apparatus and methods for making arrays having 
functionalized binding sites on a support surface and conducting a large number of chemical 
reactions on the support surface (see Brennan at abstract and col. 2, lines 11-12). Brennan 
further relates to a method of determining or confirming the nucleotide sequence of a target 
nucleic acid where the target nucleic acid is labeled and hybridized to oligonucleotides of 
known sequence bound to sites on the array plate (id. at col. 3, lines 11-15). 

26. It is my understanding that the United States Patent and Trademark Office 
("PTO") considers Brennan's disclosure of arrays having 3-mers and 10-mers attached thereto 
where every possible permutation of the 3-mer or 10-mer is provided, to be the same as each 
capture oligonucleotide of an array differing in sequence from other adjacent capture 
oligonucleotides when aligned to each other by at least 25%. I respectfully disagree for the 
reasons set forth below. 

27. Although Brennan teaches that the resulting 10-mer oligonucleotides on an 
array represent all permutations of the 10-mer sequence, each 10-mer oligonucleotide of the array 
does not differ in sequence from other adjacent 10-mer oligonucleotides, when aligned to each 
other, by at least 25% of the nucleotides. In fact, following the method of oligonucleotide 
synthesis taught by Brennan, each "oligonucleotide element, moving in a 5'-3' direction, is 
identical to the preceding element in nucleotide sequence except that it deletes the 5'-most 
nucleotide and adds a 3'-most oligonucleotide" (Brennan at col. 9, lines 49-53). Therefore, 
adjacent oligonucleotides formed according to the method of Brennan have significant 
sequence similarity when aligned. In other words, nine out of ten nucleotides of adjacent 
10-mers are the same, so that adjacent 10-mers have 90% sequence identity when aligned, 
differing by only 10% (see ClustalW2 pairwise sequence alignment results attached hereto as 
Exhibit 5). 
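
The 90% figure for adjacent 10-mers can be reproduced with a simple offset comparison. The sketch below is illustrative only: it is not the ClustalW2 alignment of Exhibit 5, the template sequence is invented, and the function names are hypothetical. It builds consecutive 10-mers that each drop the 5'-most base and add one base at the 3' end, then counts matching bases when the second 10-mer is offset by one position.

```python
def shifted_identity(a, b, shift=1):
    """Percent of the bases of `a` matched by `b` when `b` is offset by `shift` positions."""
    overlap = zip(a[shift:], b[:len(b) - shift])
    matches = sum(x == y for x, y in overlap)
    return 100.0 * matches / len(a)


def sliding_windows(seq, k):
    """Consecutive k-mers, each dropping the 5'-most base and adding one 3' base."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


template = "ACGTTGCATGCA"            # invented sequence, for illustration only
tenmers = sliding_windows(template, 10)

for first, second in zip(tenmers, tenmers[1:]):
    identity = shifted_identity(first, second)
    differs_by_25_percent = (100.0 - identity) >= 25.0
    print(first, second, identity, differs_by_25_percent)
```

Each adjacent pair shares nine of ten bases under this comparison, so the pairs differ by 10% and fall short of a 25% difference.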

Summary of U.S. Patent No. 5,594,121 to Froehler et al. ("Froehler") 

28. Froehler discloses oligomers containing 7-deaza-7-substituted purines 
and related analogs that have enhanced ability for double- and triple-helix formation with 
single- or double-stranded target nucleic acid sequences (Froehler at abstract). Such 
oligomer analog compositions can be used for diagnostic assays that employ methods where 
the oligomer or nucleic acid to be detected is covalently attached to a solid support (id. at col. 
33, lines 18-21). 

29. Froehler teaches that oligomers (e.g., dimers - hexamers) are useful as 
synthons (i.e., structural unit within a unit) for producing longer oligomers (id. at col. 6, lines 64- 
65 and col. 7, lines 59-60). However, Froehler fails to teach oligomers on a solid support where 
each oligomer differs in sequence from other adjacent oligomers, when aligned to each other, by 
at least 25% of the nucleotides. 

30. The combination of Fodor, Brennan, and Froehler does not teach arrays of 
oligonucleotides on a solid support where each capture oligonucleotide differs in sequence from 
other adjacent capture oligonucleotides, when aligned to each other by at least 25% and hybridize 
to complementary oligonucleotide target sequences under uniform hybridization conditions across 
the array of oligonucleotides. 

Summary of U.S. Patent No. 5,527,681 to Holmes ("Holmes") 

31. Holmes relates to methods, devices, and compositions for synthesis 
and use of diverse molecular sequences on a substrate (Holmes at col. 1, lines 65-66). In 
particular, Holmes discloses the synthesis of an array of polymers in which individual 
monomers in a lead polymer are systematically substituted with monomers from one or more 
basic sets of monomers (id. at col. 2, lines 1-4). On the substrate or a distal end of linker 
molecules, a functional group with a protective group is provided (id. at col. 7, lines 51-53). 
The protective group may be removed upon exposure to a chemical reagent, radiation, 
electric fields, electric currents, or other activators to expose the functional group (id. at lines 
53-57). Concurrently or after exposure of a known region of the substrate to light, the 
surface is contacted with a first monomer unit M1 which reacts with the functional group 
which has been exposed by the deprotection step (id. at col. 8, lines 5-9). Thereafter, second 
regions of the surface (which may include the first region) are exposed to light and contacted 
with a second monomer M2 (which may or may not be the same as M1) having a protective 
group (id. at lines 21-26). These steps are repeated until the substrate includes desired 
polymers of desired lengths (id. at lines 38-39). Monomers may include amino acids, 
nucleotides, pentoses, and hexoses (id. at col. 4, lines 6-11). 

32. Holmes, like Brennan and Froehler, does not teach arrays of 
oligonucleotides on a solid support where each oligonucleotide of the array differs in sequence 
from other adjacent oligonucleotides, when aligned to each other, by at least 25%. Further, the 
combination of Holmes, Brennan, and Froehler fails to teach a method in which such oligonucleotides 
are attached to a solid support and hybridize to complementary oligonucleotide target sequences 
under uniform hybridization conditions across the array of oligonucleotides. 

Summary of the Present Invention: 

33. My Array Design avoids all of the aforementioned problems associated 
with typical hybridization arrays (i.e., target-capture probe cross-hybridization and false- 
positive/negative signal generation). Identifying one or more target nucleotide sequences 
using My Array Design may employ a ligase detection reaction (LDR) followed by a high- 
throughput method of detection to decouple mutation discrimination from hybridization and 
detection. Hybridization is carried out using divergent probe sequences that are not 
homologous to the target sequence being detected or any other known genomic sequence. 
Although divergent in sequence, these probes are carefully designed to have very similar 
hybridization properties. This strategy significantly reduces cross-hybridization to enhance 
the specificity of target discrimination, while allowing for the use of uniform hybridization 
conditions across the array to facilitate a high-throughput assay format. 

34. A schematic representation of one embodiment of my invention using 
My Array Design to detect single base changes in a gene is provided in Figure 13 below. 



[Figure 13 (image). Panel A labels: Allele Specific Probe; Label; Target Probe with 
specific Zipcode; Only ligation products carry fluorescent label. Panel B labels: Zp1; 
One address can correspond to multiple alleles, distinguished by label; Address 1, 
Homozygous: T allele only; Address 2, Heterozygous: C and T alleles.] 



Figure 13: (A) LDR is performed using a common probe for each genetic locus which contains a 
unique addressable array-specific portion (Zp1 or Zp2) and allele-specific probes, each containing a 
unique detectable reporter label (e.g., Cy3 or Cy5). Ligation of the two probes occurs only when there 
is perfect complementarity at the junction. (B) The multiplexed ligation products are captured on an 
addressable array containing capture probes that are complementary to the addressable array-specific 
portion of the common LDR probe. 



35. As illustrated in Figure 13A, a plurality of oligonucleotide probe sets 
are used, where each probe set is characterized by (a) a first oligonucleotide probe having a 
target specific portion and an addressable array specific portion (Zp1 or Zp2) that is distinct 
from the target sequence and different for each gene locus that is interrogated, and (b) a 
second oligonucleotide probe having an allele-specific target portion and a unique detectable 
reporter label portion (e.g., Cy5 or Cy3). In an LDR process, the oligonucleotide probes are 
complementary to only one strand of the target nucleic acid as shown above, resulting in the 
linear amplification of the target nucleotide sequence. 



36. When oligonucleotide probes of a probe set hybridize adjacent to one 
another on a target sequence, ligation occurs only if there is perfect complementarity at the 
ligation junction. The resulting ligation product contains (a) the addressable array specific 
portion, (b) the target-specific portions, and (c) the detectable reporter label. The addressable 
array-specific portion of a ligation product is complementary to a capture oligonucleotide 
immobilized at a particular site or "address" on the solid support. As depicted in Figure 13B 
above, the addressable array-specific sequences together with a detectable label can 
discriminate between a plurality of different target sequences. 
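
The readout logic of paragraphs 35 and 36 can be sketched in code. What follows is a minimal illustration only, not the assay software: the class and function names, the assignment of Cy3 to the C allele and Cy5 to the T allele, and the pairing of Zp1 and Zp2 with the two loci of Figure 13B are assumptions made for the example. Ligation is modeled as all-or-nothing: a labeled product forms only when the allele-specific probe matches an allele present in the sample.

```python
from dataclasses import dataclass


@dataclass
class ProbeSet:
    zipcode: str         # addressable array-specific portion on the common probe
    allele_labels: dict  # allele base -> reporter label on the allele-specific probe


def ligate(probe_set, sample_alleles):
    """Return (zipcode, label) products; one forms only for an allele present in the sample."""
    return {
        (probe_set.zipcode, label)
        for base, label in probe_set.allele_labels.items()
        if base in sample_alleles
    }


def read_array(products):
    """Group the captured reporter labels by array address (zipcode)."""
    signal = {}
    for zipcode, label in products:
        signal.setdefault(zipcode, set()).add(label)
    return signal


# Hypothetical label assignment, chosen only for this example.
LABELS = {"C": "Cy3", "T": "Cy5"}
locus1 = ProbeSet("Zp1", LABELS)
locus2 = ProbeSet("Zp2", LABELS)

# Locus 1 is homozygous T; locus 2 is heterozygous C/T (as in Figure 13B).
products = ligate(locus1, {"T"}) | ligate(locus2, {"C", "T"})
signal = read_array(products)
print(signal)
```

In this model one label at an address indicates a homozygous call and two labels indicate a heterozygous call, which is the sense in which one address can correspond to multiple alleles.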



37. The plurality of capture oligonucleotides immobilized on a solid 
support are designed to differ substantially from each other in their nucleotide sequence, yet 
all have the same or similar melting temperature. This design strategy drastically minimizes 
any chance of cross-hybridization leading to false-positive signals, while allowing for 
simultaneous capture of a plurality of ligation products, by their addressable array sequence, 
under uniform hybridization conditions across the array. In accordance with My Array 
Design, we have designed 24-mer capture oligonucleotides that differed from each other by at 
least 6 bases or at least 25% when aligned to each other based on sequence similarities (see 
Figure 14 below), yet have the same or very similar melting temperatures (Tm). 

[Figure 14 (image); legible content:] 

Probe Zip 12 (2-4-4-6-1-1) = 24-mer: 5'-ATCG GGTA GGTA ACCT TGCG TGCG-3' (SEQ ID NO: 7) 
Target (LDR product 1): 3'-TAGC CCAT CCAT TGGA ACGC ACGC; 24/24 match; hybridization 

Probes: 25% or more different 

Probe Zip 14 (4-4-6-6-3-1) = 24-mer: 5'-GGTA GGTA ACCT ACCT CAGC TGCG-3' (SEQ ID NO: 8) 
Target (LDR product 1), two alignments shown: 12/24 match and 13/24 match; no hybridization 

Figure 14. Capture oligonucleotides of the present invention are designed to differ from each other by at least 
25% of their nucleotide sequence when aligned. Using this design strategy cross-hybridization between non- 
complementary addressable array portions and capture oligonucleotides will not occur. 
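
The design rule of paragraph 37 can be checked numerically for the two capture sequences legible in Figure 14 (SEQ ID NO: 7 and SEQ ID NO: 8). The sketch below is illustrative only. The function names are hypothetical, and the Wallace rule (2 °C per A/T base, 4 °C per G/C base) is a crude rule of thumb substituted here for whatever melting temperature calculation was actually used in designing the array.

```python
ZIP12 = "ATCGGGTAGGTAACCTTGCGTGCG"  # SEQ ID NO: 7
ZIP14 = "GGTAGGTAACCTACCTCAGCTGCG"  # SEQ ID NO: 8


def matches(a, b, shift=0):
    """Count identical bases when b[i] is compared with a[i + shift]."""
    return sum(
        1
        for i, base in enumerate(b)
        if 0 <= i + shift < len(a) and a[i + shift] == base
    )


def differs_enough(a, b, min_fraction=0.25):
    """True if the aligned sequences differ at min_fraction or more of their positions."""
    differing = len(a) - matches(a, b)
    return differing / len(a) >= min_fraction


def wallace_tm(seq):
    """Rough melting temperature estimate in degrees C (Wallace rule; an approximation)."""
    at = sum(seq.count(base) for base in "AT")
    gc = sum(seq.count(base) for base in "GC")
    return 2 * at + 4 * gc


print(matches(ZIP12, ZIP14), matches(ZIP12, ZIP14, shift=4))
print(differs_enough(ZIP12, ZIP14))
print(wallace_tm(ZIP12), wallace_tm(ZIP14))
```

The identity counts for the direct alignment and for an alignment offset by four bases agree with the 12/24 and 13/24 matches marked in Figure 14. Under the Wallace rule the two sequences receive the same estimate, because each contains 14 G/C and 10 A/T bases.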

38. As illustrated in Figure 14, cross-hybridization between an 
addressable array sequence and the wrong capture oligonucleotide probe sequence will not 
occur because of the extent of non-complementarity that exists between them. Because the 
capture oligonucleotide sequences remain constant (i.e., the sequence is not target-specific), 
and their complements can be appended to any set of LDR primers, the addressable arrays of 
My Array Design have universal application. 

39. As summarized above, My Array Design provides for the highly 
sensitive and specific detection and discrimination of target sequences that differ by only a 
single nucleotide substitution, deletion, or insertion in a sample. As summarized below and 
described in detail in the attached peer-reviewed publications, My Array Design provides a 
rapid and reliable method for the detection of genomic mutations (e.g., genetic disease 
mutations and cancer related mutations), promoter methylation, and infectious diseases (e.g., 
bacterial, fungal and viral infections). 

Cancer Detection 

40. Gerry et al., "Universal DNA Microarray Method for Multiplex 
Detection of Low Abundance Point Mutations," J. Mol. Biol. 292:251-62 (1999) ("Gerry") 
(attached hereto as Exhibit 6) demonstrates the simultaneous detection of seven of the most 
common point mutations in the K-ras gene that are involved in colorectal cancer using My 
Array Design coupled to an LDR assay (Gerry at abstract and p. 255, para. bridging col. 1 
and 2). LDR probe sets comprising an allele-specific probe with an addressable array portion 
(also referred to as "zip code") and a common probe having a fluorescent reporter label were 
designed to detect the seven mutations in nine individual DNA samples obtained from cell 
lines or paraffin-embedded tumor tissue (id. at p. 255, col. 2, para. 2 and Table 3). Following 
LDR, the ligated, fluorescently labeled LDR products were hybridized to an addressable 
DNA array containing capture oligonucleotides complementary to the addressable array 
sequences of the LDR products (id.). 

41. Using this method all K-ras mutations in the tumor and cell line DNA 
were correctly identified without the generation of false-positive or false-negative signals (id. at p. 
256, para. bridging col. 1 and 2). To determine the limit of detection of low level mutations 
in wild-type DNA (i.e., assay sensitivity), mutant DNA was diluted in wild-type DNA in 
ratios ranging from 1:20 to 1:500 (id. at p. 257, col. 2, para. 2). As shown in Figure 5 of 
Gerry, positive hybridization signal was quantifiable at a dilution of 1:200 with a signal-to- 
noise ratio of 2:1 (id.). These results confirmed the utility of My Array Design for detecting 
multiple nucleotide polymorphisms that are present in less than 1% of the total DNA (id.). 
We have subsequently fabricated a polymer flow-through biochip assembly that consists of a 
continuous-flow LDR microchip and a microarray chip that is capable of detecting one K-ras 
mutant sequence in the presence of 100 normal sequences (Hashimoto et al., "Ligase 
Detection Reaction/Hybridization Assays Using Three-Dimensional Microfluidic Networks 
for the Detection of Low Abundant DNA Point Mutations," Anal Chem 77:3243-3255 at 
abstract (2005) (attached hereto as Exhibit 7)). 
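
The relationship between the dilution ratios in paragraph 41 and the "less than 1% of the total DNA" figure can be worked through as simple arithmetic. The sketch below is illustrative and assumes that a ratio written 1:N means one part mutant DNA mixed with N parts wild-type DNA; the function name is hypothetical.

```python
from fractions import Fraction


def mutant_fraction(mutant_parts, wild_type_parts):
    """Fraction of total DNA that is mutant for a mutant:wild-type mixing ratio."""
    return Fraction(mutant_parts, mutant_parts + wild_type_parts)


# Dilutions spanning the range described above (1:20 to 1:500).
for wild_type_parts in (20, 200, 500):
    fraction = mutant_fraction(1, wild_type_parts)
    print(f"1:{wild_type_parts} -> {float(fraction) * 100:.2f}% mutant")
```

Under that assumption the 1:200 dilution corresponds to a mutant fraction of 1/201, which is below 1% of the total DNA, while the 1:20 dilution is above it.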

42. In addition to single nucleotide substitution mutations, many cancers 
involve small nucleotide insertions and deletions which result in frameshift mutations. For 
example, a number of small insertions and deletions are found within the BRCA1 and 
BRCA2 genes that are associated with inherited breast and ovarian cancer. A number of 
these insertion and deletion mutations are refractory to detection by direct hybridization array 
approaches, requiring the development of an alternative method. As described below, our 
technology is sensitive enough to detect sporadic mutations directly from tumor tissue within 
the p53 gene, which is involved in nearly half of all human cancers. 

43. Favis et al., "Universal DNA Array Detection of Small Insertions and 
Deletions in BRCA1 and BRCA2," Nat. Biotech. 18:561-564 (2000) ("Favis") (attached 
hereto as Exhibit 8), demonstrates the capacity of My Array Design to reliably and 
reproducibly detect small nucleotide insertions and deletions using the BRCA1 and BRCA2 
genes as a model system. As shown in Figure 1 of Favis, the method of detecting insertion 
and deletion mutations coupled a multiplex PCR step to LDR and My Array Design. This 
approach reproducibly detected both insertion and deletion mutations in BRCA1 and BRCA2 
(i.e., BRCA1 185delAG; BRCA1 5382insC; and BRCA2 6174delT). No cross-hybridization 
was detected, supporting the specificity of the method, and the reproducibility of the results 
was confirmed using a gel-based method (id. at p. 563, col. 2, para. 2). Further, even the 
presence of mutations in pooled samples was detected. 

44. p53 mutations are observed in approximately one-half of all human 
cancers. My Array Design, when applied to the detection of p53 mutational status of clinical 
biopsy samples containing <5% tumor cells, was able to detect all mutations that were 
detected by direct sequencing and a yeast functional assay (Fouquet et al., "Rapid and 
Sensitive p53 Alteration Analysis in Biopsies from Lung Cancer Patients Using a Functional 
Assay and A Universal Oligonucleotide Array: A Prospective Study," Clin Cancer Res 
10:3479-3489 at abstract and p. 3483, col. 2, para. 2 (2004) (attached hereto as Exhibit 9)). 
This approach was also used to detect 58 different p53 mutations in undissected colon tumor 
DNA samples (Favis et al., "Harmonized Microarray/Mutation Scanning Analysis of TP53 
Mutations in Undissected Colorectal Tumors," Human Mutation 24:63-75 (2004) (attached 
hereto as Exhibit 10)). 

45. An important feature of My Array Design is that it is not 
one-dimensional in its diagnostic utility. In addition to being a highly sensitive and robust method 
for detecting single base substitutions, insertions, and deletions involved in cancer 
development and progression, the method of coupling LDR to My Array Design has been 
successfully applied to the determination of promoter methylation status (Cheng et al., 
"Multiplexed Profiling of Candidate Genes for CpG Island Methylation Status Using a 
Flexible PCR/LDR/Universal Array Assay," Genome Research 16(2):282-9 at abstract 
(2006) (attached hereto as Exhibit 11)). DNA methylation in CpG islands is associated with 
transcriptional silencing, and the ability to accurately determine cytosine methylation status 
in promoter CpG dinucleotides provides diagnostic and prognostic value for many human 
cancers. My Array Design demonstrated the ability to clearly distinguish different levels of 
methylation at 75 independent CpG dinucleotides in the promoter regions of 15 tumor 
suppressor genes (id.). When compared with an independent pyrosequencing method at a 
single promoter, the two approaches gave good correlation. In a study using 15 promoter 
regions and seven blinded tumor cell lines, our technology was capable of distinguishing 
methylation profiles that identified cancer cell lines derived from the same origins (id.). 
Further, our approach has the sensitivity required to detect the presence of methylation at 
0.5% without selective PCR amplification, and at 0.05% with methyl-specific PCR 
amplification. This would correspond to identifying one tumor cell in 200 normal cells, or 
one tumor cell in 2,000 normal cells, respectively. This level of sensitivity holds the promise 
for early detection of colon cancer in DNA isolated from stool or serum. 

Infectious Disease 

46. My Array Design can be utilized for identifying and distinguishing 
infectious agents (e.g., bacterial, viral, and fungal) in many areas of biomedical science, 
including health care, biological defense, and environmental monitoring. The detection and 
identification of infectious agents must be highly sensitive and specific to distinguish closely 
related species or serotypes whose genomic sequences in specific regions differ at only a few 
nucleotide positions. 

47. Das et al., "Detection and Serotyping of Dengue Virus in Serum 
Samples by Multiplex Reverse Transcriptase PCR-Ligase Detection Reaction Assay," J. Clin. 
Microbiol. 46(10):3276-84 (2008) ("Das") (attached hereto as Exhibit 12) demonstrates the 
simultaneous serotyping and genotyping of dengue virus (DENV) in viral cultures and patient 
samples by coupling PCR based amplification to LDR and My Array Design. The assay 
accurately identified and serotyped DENV in 350 archived acute-phase serum samples, 
demonstrating 98.7% sensitivity and 98.4% specificity for detection (id. at p. 3280, col. 2, 
para. 2). The detection limit for the assay ranged from 0.004 to 0.7 plaque forming units 
(PFU)/reaction, comparable to those reported for other techniques (id. at para. bridging pp. 
3282-83). The assay was highly specific for the detection of DENV with no cross reactivity 
to seven other similar flaviviruses (id. at p. 3281, col. 1, para. 2 and Figure 2). We have also 
employed this assay for the successful identification of West Nile viral strains, which also 
exhibit considerable genomic diversity, in clinical samples (see Rondini et al., "Development 
of Multiplex PCR-Ligase Detection Reaction Assay for Detection of West Nile Virus," J. 
Clin. Microbiol. 46:2269-79 (2008), attached hereto as Exhibit 13). 

Conclusion: 

48. There are numerous advantages afforded by My Array Design. Direct 
hybridization arrays were designed to detect single nucleotide polymorphisms, with 
discrimination based on hybridization of target sequences to perfectly matched or 
mismatched probes. However, as discussed supra, the utility of direct hybridization methods 
to accurately and reproducibly detect and discriminate even these single nucleotide variations 
is questionable. The extent of sequence similarity between mismatch and perfect match 
probe sequences enables cross-hybridization and leads to the generation of both false-positive 
and false-negative signals. The findings of Naiser and Pozhitkov suggest that the problem of 
cross-hybridization is only further compounded by the unpredictable nature of 
oligonucleotide probe-target hybridization. Direct hybridization methods are not well suited 
for the detection of most other types of nucleic acid sequence variations. In fact, in most 
cases, the ability to detect insertion/deletion mutations has proven intractable. 

49. In contrast to direct hybridization methods, My Array Design has 
demonstrated the ability to detect insertion/deletion mutations, mononucleotide and 
dinucleotide repeats, and even methylation of CpG islands. In addition, it can be used to 
identify and quantify splice site changes, quantify RNA levels for gene expression profiling, 
and determine DNA copy level changes, loss of heterozygosity, and SNPs for genome-wide 
association studies. 

50. A significant advantage of My Array Design is that it relies on 
divergent capture-specific probe sequences, designed to differ in sequence by at least 25%, 
yet have similar melting temperatures. The result: cross-hybridization is minimized, if not 
eliminated, even under uniform hybridization conditions. The composite probes and 
products, which contain a target specific portion and a capture specific portion, allow for 
accurate target identification and discrimination of closely spaced and overlapping mutations, 
including small insertions and deletions, without generating false-positive or false-negative 
signals. In addition, the method has proven to be highly sensitive, capable of detecting low 
abundance mutations in heterogeneous clinical samples. This sensitivity permits early disease 
detection, which can be critical for a good disease prognosis. The ability to use the method to 
detect promoter methylation silencing of tumor suppressor genes helps predict disease 
outcome and guide cancer treatment. 

51. I hereby declare that all statements made herein of my own knowledge 
are true and that all statements made on information and belief are believed to be true; and 
further that these statements were made with the knowledge that willful false statements and 
the like so made are punishable by fine or imprisonment, or both, under section 1001 of 
Title 18 of the United States Code, and that such willful false statements may jeopardize the 
validity of the application or any patent issuing thereon. 



Date: March 27, 2010 



Francis Barany, Ph.D. 

Health January, 2010 

Chairman of Partnerships for Point of Care (POC) Diagnostic Technologies for 
Nontraditional Health Care Settings Study Section, National Institute of Allergy and 
Infectious Diseases, National Institutes of Health October, 2008 
Co-founder of the New York State Cancer Initiative working committee, 2007 
Member of the Ensemble Scientific Advisory Board 2007 

Member of Scientific Advisory Board, Center for BioModular Multi-scale Systems, LA, 
Oct. 2005 

Chairman of Innovative Technologies for the Molecular Analysis of Cancer Study 
Section, National Cancer Institute, National Institutes of Health November, 2003 July 
2004 

Member of study section reviewing P01, National Institutes of Health, March 2002, 
November 2002 

Member of Site Visit Team reviewing Jackson Labs, National Cancer Institute, National 
Institutes of Health, February 2001 

Member of SBIR study section, National Institutes of Health, July 2000 
Co-organizer, 1998 FASEB Conference "Nucleic Acid Enzymes: Mechanisms and 
Diseases." 

Member of Innovative Technologies for the Molecular Analysis of Cancer Study Section, 
National Cancer Institute, National Institutes of Health November, 1998, July 1999, July 
2000, March 2002 

Member of Novel Technologies for Evaluation of Molecular Alterations in Tissue, and 
Technologies for Generation of Full-Length cDNA Libraries, Special Study 
Sections, National Cancer Institute, National Institutes of Health, July 1997, March 1998. 
Ad-hoc member of Developmental Diagnostics Working Group, National Cancer 
Institute, National Institutes of Health, July 1997. 

Member of Advanced Diagnostics for Pathogens Study Section, Defense 
Advanced Research Projects Agency, June and September 1997. August, 1998. 
Ad-hoc member of Human Genome Study Section, National Institutes of Health 
November 1994 and June 1995. 

Site Visit of National Cancer Institute Early Detection Research Network, June 1995. 

Member of Academic Medicine Development Corporation New York Cancer Project 
Consortia Group (1997-Present) 

Editor of Gene (1987- 1995) 

Editorial Advisory Board Member of Gene (1996) 

Scientific Advisory Board of Amplicon (1995-1998) 

Referee of research articles submitted for Science, Proceedings of the National 
Academy of Sciences, EMBO Journal, Gene, J. Bacteriology, Biochemistry, and 
Nucleic Acids Research. 
Reviewer for NSF grants 






Professional Societies: 

Phi Eta Sigma, 1974 
Sigma Xi, 1980 

American Society for Microbiology, 1990 

American Society for Biochemistry and Molecular Biology, 1990 






Patents: 

patents from Weill Cornell Medical College. The Barany Laboratory patents and intellectual 
property have generated over $28 million in NIH Grants, NIST Grant, Industrial Sponsored 
Research Grants, over $6 million in royalties to Weill-Cornell, and over $1.3 billion in sales or 

1. Barany, F. Six base oligonucleotide linkers and methods for their use. Licensed to 
Pharmacia P-L Biochemicals, WI. (U.S. Patent #4,719,179; issued May 1988). 

2. Barany, F., Zebala, J., Nickerson, D., Kaiser, R., & Hood L. Thermostable ligase 
mediated DNA amplification system for the detection of genetic diseases. Licensed to 
Applied Biosystems/Perkin Elmer Inc., Foster City, CA. Thermostable ligase licensed to 
Roche Molecular Systems, Alameda, CA and New England Biolabs, Beverly, MA (U.S. 
Patent #5,494,810, issued February, 1996; U.S. Patent # 5,830,711, issued November 
1998; and, U.S. patent #6,054,564, issued April, 2000). 

3. Barany, F., Barany, G., Hammer, R.P., Kempe, M., Blok, H., & Zirvi, M. Detection of nucleic 
acid sequence differences using the ligase detection reaction with addressable arrays. 
Licensed to Applied Biosystems/Perkin Elmer Inc., Foster City, CA. (U.S. patent # 
6,852,487; issued February, 2005; and U.S. patent # 7,083,917; issued August, 2006). 

4. Barany F., Belgrader, P., & Lubin, M. Detection of nucleic acid sequence differences using 
coupled ligase detection and polymerase chain reactions. Licensed to Applied 
Biosystems/Perkin Elmer Inc., Foster City, CA (U.S. patent #6,027,889, issued February, 
2000; U.S. patent #6,268,148, issued July, 2001; U.S. patent # 6,797,470, issued 
September 2004; 

5. Barany F., Lubin, M., Barany, G., & Hammer, R.P. Detection of nucleic acid sequence 
differences using coupled ligase detection and polymerase chain reactions. Licensed to 
Applied Biosystems/Perkin Elmer Inc., Foster City, CA (U.S. patent # 7,097,980, issued 
August 2006; U.S. patent # 7,166,434, issued January 2007; U.S. patent # 7,312,039, 
issued December 2007, U.S. patent # 7,320,865, issued January 2008, U.S. patent # 
7,332,285, issued February 2008, U.S. patent # 7,364,858, issued April 2008, U.S. 
patent # 7,429,453, issued September 2008, U.S. patent # 7,556,924, issued July 2009). 

6. Barany, F., Luo, J., Khanna, M., & Bergstrom, D. High fidelity detection of nucleic acid 

of colon tumors; Program Director: Barany, F. 

93. Gao H, Huang J, Barany F, Cao W. (2007) Switching base preferences of mismatch 
cleavage in endonuclease V: an improved method for scanning point mutations. Nucleic 
Acids Res. 35(1):e2. 

94. Pingle MR, Granger K, Feinberg P, Shatsky R, Sterling B, Rundell M, Spitzer E, Larone D, 
Golightly L, Barany F. (2007) Multiplexed identification of blood borne bacterial 
pathogens using a novel 16s rDNA PCR/LDR/Capillary Electrophoresis assay. J Clin 
Microbiol. 45:1927-1935. 

95. Gavert N, Shaffer M, Raveh S, Spaderna S, Shtutman M, Brabletz T, Barany F*, Paty P*, 
Notterman D*, Domany E*, Ben-Ze'ev A. (2007). Expression of L1-CAM and ADAM10 in 
human colon cancer cells induces metastasis. Cancer Res. 67:7703-12. 
*Co-investigators of an international collaboration on molecular profiling of colon tumors; 
Program Director: Barany, F. 

96. Hashimoto M, Barany F, Xu F, Soper SA. (2007) Serial processing of biological reactions 
using flow-through microfluidic devices: coupled PCR/LDR for the detection of 
low-abundant DNA point mutations. Analyst. 132:913-21. 

97. Forslund, A., Zeng, Z, Qin, L, Rosenberg, S., Ndubuisi, M., Pincas, H., Gerald, W*, 
Notterman D*, Barany F*, Paty P* (2008). Mdm2 gene amplification is correlated to 
tumor progression but not to presence of snp309 or TP53 mutational status in primary 
colorectal cancers. Molecular Cancer Research 6:205-211. *Co-investigators of an 
international collaboration on molecular profiling of colon tumors; Program Director: 
Barany, F. 

98. Bacolod M, Schemmann G, Wang S, Shattock R, Giardina S, Zeng Z, Shia J, Stengel R, 
Gerry N, Hoh J, Kirchhoff T, Gold B, Christman M, Offit K, Gerald W*, Notterman D*, Ott 
J*, Paty P*, Barany F*. (2008). The Signatures of Autozygosity among Patients with 
Colorectal Cancer. Cancer Res. 68:2610-21. *Co-investigators of an international 
collaboration on molecular profiling of colon tumors; Program Director: Barany, F. 

99. Khan SA, Idrees K, Forslund A, Zeng Z, Rosenberg S, Pincas H, Barany F*, Offit K, 
Laquaglia MP, Paty P*. (2008). Genetic variants in germline TP53 and MDM2 SNP309 
are not associated with early onset colorectal cancer. J Surg Oncol. 97:621-5. 
*Co-investigators of an international collaboration on molecular profiling of colon tumors; 
Program Director: Barany, F. 

100. Zeng ZS, Weiser MR, Kuntz E, Chen CT, Khan SA, Forslund A, Nash GM, Gimbel M, 
Yamaguchi Y, Culliford AT 4th, D'Alessio M, Barany F*, Paty P*. (2008) c-Met gene 
amplification is associated with advanced stage colorectal cancer and liver metastases. 
Cancer Lett. 265:258-69. *Co-investigators of an international collaboration on molecular 
profiling of colon tumors; Program Director: Barany, F. 

101. Rondini S, Pingle MR, Das S, Tesh R, Rundell MS, Horn J, Stramer S, Turner K, 
Rossmann SN, Lanciotti R, Spier EG, Munoz J, Larone D, Spitzer E, Barany F, Golightly 
LM. (2008) Development of a multiplex PCR/LDR assay for detection of West Nile Virus. 
J Clin Microbiol. 46(7):2269-79. 

102. Das S, Pingle MR, Muñoz-Jordan J, Rundell MS, Rondini S, Granger K, Chang GJ, Kelly 
E, Spier EG, Larone D, Spitzer E, Barany F, Golightly LM. (2008) Detection and 
Serotyping of Dengue Virus in Serum Samples by Multiplex Reverse 
Transcriptase-PCR/LDR Assay. J Clin Microbiol. 46:3276-84. 

103. Wang S, Haynes C, Barany F*, Ott J*. (2009) Genome-wide autozygosity mapping in 
human populations. Genet Epidemiol. 2009 Feb;33(2):172-80. *Co-investigators of an 
international collaboration on molecular profiling of colon tumors; Program Director: 
Barany, F. 

104. Cheng YW, Pincas H, Bacolod MD, Schemmann G, Giardina SF, Huang J, Barral S, Idrees 
K, Khan SA, Zeng Z, Rosenberg S, Notterman DA*, Ott J*, Paty P*, Barany F*. (2008) 
CpG island methylator phenotype associates with low-degree chromosomal 
abnormalities in colorectal cancer. Clin Cancer Res. 2008 Oct 1;14(19):6005-13. 
*Co-investigators of an international collaboration on molecular profiling of colon tumors; 
Program Director: Barany, F. 

105. Sinville R, Coyne J, Meagher RJ, Cheng YW, Barany F, Barron A, Soper SA. (2008) 
Ligase detection reaction for the analysis of point mutations using free-solution 
conjugate electrophoresis in a polymer microfluidic device. Electrophoresis 
29(23):4751-4760. 

106. Bacolod M, Schemmann G, Giardina S, Paty P*, Notterman D*, Barany F*. (2009) 
Emerging paradigms in cancer genetics: some important findings from high-density 
SNP array studies. Invited Review. Cancer Res. 2009 Feb 1;69(3):723-7. 
*Co-investigators of an international collaboration on molecular profiling of colon tumors; 
Program Director: Barany, F. 

107. Sheffer M, Bacolod MD, Zuk O, Giardina SF, Pincas H, Barany F*, Paty PB*, Gerald WL*, 
Notterman DA*, Domany E*. (2009) Association of survival and disease progression with 
chromosomal instability: a genomic exploration of colorectal cancer. Proc Natl Acad Sci 
USA. 2009 Apr 28;106(17):7131-6. *Co-investigators of an international collaboration 
on molecular profiling of colon tumors; Program Director: Barany, F. 

108. Nash GM, Gimbel M, Cohen AM, Zeng ZS, Ndubuisi MI, Nathanson DR, Ott J*, Barany F*, 
Paty PB*. (2009) KRAS Mutation and Microsatellite Instability: Two Genetic Markers of 
Early Tumor Development That Influence the Prognosis of Colorectal Cancer. Ann Surg 
Oncol. 2009 Oct 8. [Epub ahead of print]. *Co-investigators of an international 
collaboration on molecular profiling of colon tumors; Program Director: Barany, F. 

109. Granger K, Rundell MS, Pingle MR, Shatsky R, Larone DH, Golightly LM, Barany F, 
Spitzer ED. (2009) Multiplex PCR-Ligation Detection Reaction assay for the 
simultaneous detection of drug resistance and toxin genes from Staphylococcus aureus, 
Enterococcus faecalis and Enterococcus faecium. J Clin Microbiol. 2009 Oct 28. [Epub 
ahead of print]. 

110. Cheng YW, Idrees K, Shattock R, Khan SA, Zeng Z, Brennan CW, Paty P*, Barany F*. 
(2009) Loss of imprinting and marked gene elevation are two forms of aberrant IGF2 
expression in colorectal cancer. Int J Cancer. 2009 Dec 2. [Epub ahead of print]. 
*Co-investigators of an international collaboration on molecular profiling of colon tumors; 
Program Director: Barany, F. 

LECTURES 



1. Plasmid exchange between Streptococcus pneumoniae and E. coli. Wind River Conference 
of Genetic Exchange, Estes Park, CO, June 9, 1981. 

2. Plasmid exchange between Gram-positive and Gram-negative bacteria. Northwestern 
University Medical and Dental School, Chicago, IL, August 5, 1981. 

3. Single-stranded DNA transformation and gene expression of fl- plasmid hybrids in 
Streptococcus pneumoniae and Escherichia coli. ASM International Conference on 
Streptococcal Genetics, Sarasota, FL, November 10, 1981. 

4. E. coli insertion elements that increase expression of heterologous genes. New York Public 
Health Research Institute, New York, December 15, 1981. 

5. Heterologous gene expression in Gram-positive and Gram-negative bacteria. University of 
Illinois at Chicago Circle, Chicago, IL. April 10, 1982. 

6. How DNA eludes restriction during Haemophilus transformation. Pharmacia-P.L. 
Biochemicals, Inc., Milwaukee, WI, August 17, 1983. 

7. The mechanism of DNA uptake and integration in Haemophilus transformation. Keynote 
speaker at Mid-Atlantic Extrachromosomal Genetics Elements Meeting, Virginia Beach 
VA, October 1, 1983. 

8. Directional transport and integration of donor DNA during Haemophilus transformation. 
Temple University School of Medicine, Philadelphia, PA, June 14, 1984. 

9. Single-stranded hexameric oligonucleotide linkers for in vitro mutagenesis. Helen Hay 
Whitney Foundation meeting, Arden House, Harriman, NY, December 8, 1984. 

10. Two-codon insertion mutagenesis of plasmid genes. University of California, Berkeley, 
Naval Biosciences Laboratory, Oakland, CA, March 7, 1985. 

11. Single-stranded hexameric oligonucleotide linkers: Use for in vitro mutagenesis. 

Mutagenesis workshop UCLA Symposia - Protein Structure, Folding and Design. 
Keystone, CO, April 1, 1985. 

12. Two-codon insertion mutagenesis. Invited speaker at 15th Linderstrom-Lang Conference of 

the Swedish Society for Microbiology and the Swedish Biochemical Society, Umea, 
Sweden, September 23, 1985. 

13. Insertion mutagenesis of bacterial genes. NY Prokaryotic Molecular Biologists, The 

Rockefeller University, NY, May 27, 1986. 

14. Two-codon insertion mutagenesis of the TaqI restriction endonuclease. New England

Biolabs, Beverly, MA, December 4, 1986. 

15. Analysis of functional domains using two-codon mutagenesis. Harvard Medical School, 

Boston, MA, December 5, 1986. 
16. Analysis of protein functional domains using two codon insertion mutagenesis. University

of Chicago, Chicago, IL, April 14, 1987.

17. Analysis of functional domains using two codon insertion mutagenesis. Invited speaker at 

Cornell University Biotechnology Symposium: Genetic engineering of proteins. Ithaca 
N.Y. October 20, 1987. 

18. A Tag * is born. New England Biolabs, Beverly, MA, December 10, 1987. 

19. How TaqI restriction endonuclease recognizes its cognate sequence. University of

Nebraska, Lincoln NE. February 17, 1988.

20. How TaqI recognizes its cognate sequence. Cold Spring Harbor Laboratories Cold Spring

Harbor N.Y. May 17, 1988.

21. Sequence specific recognition of DNA by TaqI endonuclease. Workshop on Biological DNA

Modification. Gloucester MA, May 21, 1988.

22. How TaqI restriction endonuclease recognizes its cognate sequence. Max-Planck-Institut

für Molekulare Genetik, Berlin West Germany, September 19, 1988.

23. How TaqI restriction endonuclease recognizes its cognate sequence. SIBIA, San Diego CA

November 15, 1988.

24. How TaqI restriction endonuclease recognizes its cognate sequence. University of California

San Francisco, San Francisco CA, November 18, 1988.

25. How TaqI restriction endonuclease recognizes its cognate sequence. Abbott Laboratories,

Chicago IL, November 21, 1988.

26. How TaqI restriction endonuclease recognizes its cognate sequence. Dupont, Central Res.

and Dev. Dept. Wilmington, DE, January 25, 1989.

27. How TaqI restriction endonuclease recognizes its cognate sequence. Brookhaven National

Labs. Upton, Long Island, NY, January 26, 1989.

28. How TaqI restriction endonuclease recognizes its cognate sequence. Hunter College New

York, NY, October 27, 1989.

29. How TaqI restriction endonuclease recognizes its cognate sequence. University of

Rochester, Rochester, NY, November 30, 1989.

30. How TaqI restriction endonuclease recognizes its cognate sequence. California Institute of

Technology, Pasadena, CA, January 25, 1990. 

31. Detection of genetic diseases using thermostable DNA ligase. Genome Mapping and 

Sequencing Conference, Cold Spring Harbor, NY, May 3, 1990. 

32. The exquisite specificity of Thermus aquaticus DNA recognition proteins. University of 

Maryland at Baltimore, Baltimore, MD, May 14, 1990. 
33. The exquisite specificity of Thermus aquaticus DNA recognition proteins. Johns Hopkins

University School of Medicine, Baltimore, MD, May 15, 1990.

34. The exquisite specificity of Thermus aquaticus DNA recognition proteins; Correlations

between codon insertion mutants, revertants, and the three dimensional structure of
β-lactamase. International Symposium on Site-Directed Mutagenesis and Protein
Engineering, (DNA/Protein interactions session Chairman). Tromso, Norway, August
29 & 30, 1990.

35. TaqI endonuclease insertion mutants and a comparison of its sequence with the TthHB8I

isoschizomer. Invited speaker at Second New England Biolabs Workshop on Biological 
DNA Modification. West Berlin, Germany, September 3, 1990. 



36. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic

diseases. Research Institute of Molecular Pathology, Vienna, Austria, September 7,
1990.



37. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. University College and Middlesex School of Medicine. London England 
September 10, 1990. 

38. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Stratagene. San Diego, CA, November 16, 1990. 

39. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Cetus. Emeryville, CA, November 19, 1990. 

40. The exquisite specificity of Thermus aquaticus DNA recognition proteins. College of 

Physicians & Surgeons of Columbia University. New York, NY, January 18, 1991. 

41. Single-nucleotide genetic disease detection using cloned thermostable ligase. Invited 

speaker at Miami/Biotechnology winter symposia. Advances in gene technology; The 
molecular biology of human genetic disease. Miami, FL, January 31, 1991.

42. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Mount Sinai Medical School. New York, NY, February 5, 1991.

43. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. University of Massachusetts Medical Center. Worcester, MA February 22 
1991. 



44. Detection of genetic diseases. 50th Westinghouse Science Talent Search Alumni 

Reunion. Washington DC. March 3, 1991. 

45. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Public Health Research Institute, New York. April 2, 1991. 

46. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Cornell University, Ithaca, NY, April 25, 1991. 

47. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Centers for Disease Control, Atlanta, GA, May 14, 1991. 
48. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic

diseases. Rocky Mountain Laboratory, Hamilton, MT, August 16, 1991.

49. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. BioRad, San Francisco, CA, November 25, 1991. 

50. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Applied Biosystems Inc. Foster City, CA February 18, 1992. 

51. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Max-Planck-Institut für Molekulare Genetik, Berlin, Germany, March 23, 1992.

52. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Institut für Mikrobiologie und Molekularbiologie, Giessen,
Germany, March 26, 1992. 

53. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. Institut Pasteur, Paris, France, March 31, 1992. 

54. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. University of Leicester, Leicester, England, April 3, 1992. 

55. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. University of Bristol, Bristol, England, April 7, 1992. 

56. Thermus aquaticus DNA recognition proteins, and their use for detection of genetic 

diseases. SUNY at Stonybrook, Stonybrook, NY, April 16, 1992. 

57. Single-nucleotide disease detection using ligase chain reaction. Invited speaker at the 92nd

American Society for Microbiology Meeting, New Orleans, LA, May 29, 1992. 

58. The ligase chain reaction (LCR) for detection of mutations. Invited speaker at Seventh 

annual workshop on recent advances in molecular pathology, Tufts University School of 
Medicine, Boston, MA, June 19, 1992. 

59. Thermophilic DNA recognition proteins, and their use for detection of genetic diseases. 

University of Illinois College of Medicine, Chicago, IL, November 5, 1992. 

60. New concepts in cancer detection. Applied Biosystems Inc. Foster City, CA, March 25 

1993. 

61. Genetic disease detection. Roche Molecular Systems. Alameda, CA, March 26, 1993. 

62. Detection of genetic diseases. University of Illinois at Chicago, Chicago, IL, April 30, 1993. 

63. Detection of genetic and infectious diseases using DNA diagnostics. Invited speaker at; 

PCR: Alternative technologies and applications, Boston, MA, June 7, 1993. 

64. A biochemical analysis of the Taq\ restriction endonuclease. Invited speaker at; Restriction 

endonucleases and modification methyltransferases: Structures and mechanisms, 
FASEB research conference, Saxton River, VT, July 6, 1993.
65. Genetic disease detection. Strang Cancer Research Laboratory, New York, NY, December

13, 1993. 

66. Genetic disease detection. Skirball Institute of Biomolecular Medicine, New York, NY,

February 17, 1994. 

67. New methods of detecting genetic diseases and cancers. Perkin Elmer/Applied Biosystems 

Foster City, CA, March 17, 1994. 

68. New methods of detecting genetic diseases and cancers. Dean's Hour Lecture. Cornell 

University Medical College, New York, NY, March 23, 1994. 

69. New methods of detecting genetic diseases and cancers. Yale University School of 

Medicine, New Haven, CT April 27, 1994. 

70. New methods of detecting genetic diseases and cancers. University of Maryland, Baltimore 

MD, May 2, 1994. 

71. New methods of detecting genetic diseases and cancers. Wayne State University, Detroit 

MI, May 9, 1994.

72. New methods of detecting genetic diseases and cancers. Memorial Sloan Kettering Institute, 

New York, NY December 8, 1994. 

73. New methods of detecting cancers. Oncor, Gaithersburg, MD, January 5, 1995. 

74. New methods of detecting genetic diseases and cancers. Myriad, Salt Lake City 

UT, February 6, 1995. 

75. New methods of detecting genetic diseases and cancers. City of Hope Beckman Research 

Center, Duarte, CA, February 7, 1995. 

76. New methods of detecting genetic diseases and cancers. Applied Biosystems Division of 

Perkin Elmer, Foster City, CA, February 8, 1995. 

77. New methods of detecting genetic diseases and cancers. National Cancer Institute 

Rockville, MD, March 8, 1995. 

78. Detecting genetic diseases and cancers. National Institutes of Health, Rockville, MD, March 

9, 1995. 

79. New methods of detecting genetic diseases and cancers. Cold Spring Harbor Laboratories 

Cold Spring Harbor, NY April 3, 1995. 

80. New methods of detecting genetic diseases and cancers. Invited Speaker at "Accelerating 

Gene Discovery and Mutation Detection" conference, New York, NY, May 16, 1995. 

81. New methods of detecting genetic diseases and cancers. Memorial Sloan Kettering Institute 

New York, NY, May 31, 1995.
82. New methods of detecting genetic diseases and cancers. Wadsworth Center for

Laboratories and Research, Albany, NY, June 6, 1995.

83. New methods of detecting genetic diseases and cancers. Memorial Sloan Kettering Institute 

New York, NY June 7, 1995. 

84. New methods of detecting genetic diseases and cancers. National Cancer Institute, Early 

Detection Research Network Site Visit, Rockville, MD, June 8, 1995. 

85. New approaches to cancer detection. Strang Cancer Prevention Center Annual Meeting 

New York, NY, June 9, 1995. 

86. New methods of detecting genetic diseases and cancers. Invited speaker at "Applications of 

Diagnostics in Health Care, the Environment, and Agriculture." Cornell University Ithaca 
NY, October 9, 1995. 

87. New methods of detecting genetic diseases and cancers. Yale University Medical 

School, New Haven, CT, November 21, 1995. 

88. New methods of detecting genetic diseases and cancers. Chiron, Emeryville, CA,

December 5, 1995. 

89. Vistas in genetic analysis. University of Illinois at Chicago. "Biochemistry and 

Pathophysiology of Muscle. A symposium in tribute to Professors Michael and Kate 
Barany." Chicago, IL, May 13, 1996.

90. New methods of detecting genetic diseases and cancers. Motorola, Phoenix, AZ, May 20,

1996. 

91. DNA recognition proteins and their use in detecting genetic diseases and cancers. Invited 

speaker at "Enzymes that act on Nucleic Acids" FASEB research conference Saxton 
River, VT, June 20, 1996.

92. Development of programmable DNA arrays. Progress report on the ATP/NIST joint project.

Applied Biosystems Division of Perkin Elmer, Foster City, CA, August 15, 1996. 

93. (i) Improving the fidelity of thermostable DNA ligase. (ii) Multiplexed detection of K-ras 

mutations in colorectal cancer. Applied Biosystems Division of Perkin Elmer, Foster 
City, CA, August 16, 1996. 

94. New methods of detecting genetic diseases and cancers. Johnson & Johnson Diagnostics,

Rochester, NY, August 26, 1996. 

95. New methods of detecting genetic diseases and cancers. Merck Research Laboratories 

West Point, PA, October 10, 1996. 

96. Early detection of colon cancer mutations. New York Human Genetics Club, Columbia 

University College of Physicians & Surgeons, NY, October 17, 1996. 

97. DNA recognition proteins and their use in detecting genetic diseases and cancers. The 

Institute of Genomic Research, Rockville, MD, October 18, 1996. 
98. New methods of detecting genetic diseases and cancers. City of Hope Beckman Research

Center, Duarte, CA, October 23, 1996. 

99. Multiplexed detection of genetic and forensic polymorphisms. Motorola, Schaumburg IL 

November 7, 1996 

100. New methods of detecting genetic diseases and cancers. Perkin Elmer Wilton CT 

November 18, 1996. 

101. New methods of detecting genetic diseases and cancers. Novel Amplification 

Technologies (Chairman: Update on the latest Amplification Technologies), Washington 
D.C., December 16, 1996.

102. New approaches to detecting repeat sequence polymorphisms associated with colorectal 

cancer. Motorola, Schaumburg, IL, April 11, 1997.

103. New approaches to detecting point mutations and repeat sequence polymorphisms 

associated with spontaneous colorectal cancer. Applied Biosystems Division of Perkin 
Elmer, Foster City, CA, April 24, 1997. 

104. Fidelity and Error in Thermophilic DNA-Recognition Proteins. The Rockefeller University 

New York, NY, January 13, 1998.

105. New Methods of Cancer Detection. Yale University Medical School New Haven CT 

January 27, 1998. 

106. Thermophilic DNA recognition proteins, and their use for detection of cancer-associated 

mutations. Co-organizer and speaker at "Nucleic Acid Enzymes: Mechanisms and
Diseases" FASEB research conference, Saxton River, VT, June 16, 1998.

107. New Methods of Cancer Detection. Abbott Labs, Chicago, IL, October 22, 1998.

108. New Methods of Cancer Detection. Columbia University College of Physicians & 

Surgeons, NY, October 26, 1998. 

109. New Methods of Cancer Detection. Mount Sinai Medical School, NY, October 27, 1998. 

110. New Methods of Cancer Detection. Public Health Research Institute, NY November 3 

1998. 

111. Around the Genome in 80 days. Celera, Rockville, MD. January 7, 1999. 

112. New Methods of Cancer Detection. National Cancer Institute. Bethesda, MD March 25 

1999. 

113. New Methods of Cancer Detection. Institute of Biotechnology, San Antonio, Tx. April 6, 

1999.

114. New Methods of Cancer Detection. University of Texas Austin, Austin, Tx. April 7, 1999. 

115. Cancer Detection from Clinical Samples: Future Challenges for Nanofabricated Devices. 

Invited speaker at Nanofabricated Devices Conference, San Jose, CA, April 20, 1999. 
116. Universal DNA arrays and new enzymes for cancer detection. PE-Biosystems, Foster City

CA, April 21, 1999. 

117. New Methods of Cancer Detection. National Cancer Institute. Bethesda, MD April 28 

1999. 

118. New Methods of Cancer Detection. Institute Pasteur. Paris, France. May 31, 1999. 

119. New Methods of Cancer Detection. Karolinska Institute. Stockholm, Sweden. June 1, 1999. 

120. New Methods of Cancer Detection. Institute Curie. Paris, France. June 2, 1999. 

121. Multiplex detection of cancer mutations using ligase based assays: application to p53, K- 

ras, met oncogene, APC, BRCA1, and BRCA2. Invited speaker and session chair at 
Eurocancer99, Paris, France, June 3, 1999. 

122. Use of DNA recognition proteins to identify genome changes in cancers. Invited speaker 

at "Nucleic Acid Enzymes: Structures, Mechanisms and Novel Applications" FASEB 
research conference, Saxton River, VT, June 21, 2000.

123. New Methods of Cancer Detection. Institut de Genetique Moleculaire. Montpellier France 

August 28, 2000. 

124. New Methods of Cancer Detection. Institute Curie. Paris, France. August 29, 2000. 

125. New Methods of Cancer Detection. Invited Keynote Speaker at "Arrays and Beyond" 

Wadsworth Center, Albany, NY, December 5, 2000. 

126. New Methods of Cancer Detection. Distinguished Speakers Seminar Program, National 

Cancer Institute, Frederick Cancer Research and Development Center. Ft. Detrick MD 
December 12, 2000. 

127. New Methods of Cancer Detection. Invited Speaker at American Society for Clinical 

Oncologists (ASCO). San Francisco, CA. May 12 and 14, 2001. 

128. New Methods of Cancer Detection. Invited Speaker at Chips-to-Hits IBC meeting. San 

Diego, CA. November 1, 2001. 

129. Molecular profiling of colon tumors. Applied Biosystems, Foster City, CA February 19 

2002.

130. New Methods of Cancer Detection. University of Southern California, Los Angeles, CA,

February 21, 2002.

131. New Methods of Cancer Detection. Purdue University, West Lafayette, IN March 18, 2002. 

132. New Methods of Cancer Detection. Invited Speaker at Lennox K. Black Symposium - 

"Genomics & Bioinformatics For The Advancement Of Clinical Science" Philadelphia 
PA, Oct. 13, 2002.
133. New Methods of Cancer Detection. Invited Speaker at American Association of Cancer

Researchers (AACR). Frontiers in Cancer: Prevention Research, Boston, MA, Oct. 14,
2002.

134. New Methods of Cancer Detection. MIT, Boston, MA Oct. 17, 2002. 

135. Potential Benefits of Molecular Profiling to Cancer Diagnosis. Plenary Address, Invited 

Speaker, at Chips-to-Hits IBC meeting. Philadelphia, PA. October 30, 2002. 

136. Molecular Profiling of Tumors. Grand Rounds, Dept. of Pathology, Weill Medical College of 

Cornell University, New York, NY, November 25, 2002. 

137. Harmonized Microarray / Mutation Scanning Analysis of Colorectal Tumors. Invited 

Speaker and Session Chair. 7th Mutation Detection Workshop, Palm Cove Queensland 
Australia July 3, 2003. 

138. Harmonized Microarray / Mutation Scanning and Methylation Analysis of Colorectal 

Tumors. Keynote Address. BioArrays-2003-New York, New York, NY October 1, 2003. 

139. Multiplexed pathogen detection for Biodefense. Applied Biosystems, Foster City, CA,
February 18, 2004.

140. Molecular profiling of colon tumors. Louisiana State University. Baton Rouge, LA, May 13,

2004.

141. Molecular profiling of Cancer. Invited Keynote Speaker. Pharmacogenomics /
Toxicogenomics Johnson & Johnson Symposium, New Brunswick, NJ September 29 
2004. 

142. Molecular profiling of colon tumors. Applied Biosystems, Foster City, CA, October 25,

143. Molecular profiling of tumors. Keynote Address at Clinical Genomics IBC conference San 

Diego, CA, February 17, 2005. 

144. Molecular profiling of colon tumors. National Cancer Institute, Washington, DC, March 16, 

2005. 

145. Molecular profiling of tumors. Invited Speaker at Technology Fair 2005. United States 

Patent Office, Washington, DC, March 17, 2005. 

146. Molecular Detection and Diagnosis: The role of detection and rapid diagnosis in treating 

infectious disease. Invited Speaker at National Academy of Sciences Workshop on "New 
directions in the study of antimicrobial therapeutics: New classes of antimicrobials" 
Washington, DC, March 24, 2005. 

147. Molecular profiling of cancers. Weizmann Institute of Science, Rehovot, Israel, September

4, 2005.

148. Molecular techniques for tumor investigations. Invited speaker, British Human Genetics 

Conference, York, England, September 14, 2005. 
149. Molecular profiling of cancers. University of Notre Dame, Notre Dame, IN, October 10,



150. Multiplexed detection of biothreat agents. New York Department of Public Health NY 

January 27, 2006. 

151. Molecular profiling of cancers. Celera Diagnostics. Alameda, CA, February 21, 2006. 

152. Molecular profiling of cancers. Roche Molecular Systems. Alameda, CA, February 21,

2006.

153. Multiplexed detection of blood-borne pathogens. Cepheid. Sunnyvale, CA, February 22, 



154. Molecular profiling of colon tumors. Affymetrix. Santa Clara, CA, February 22, 2006. 

155. Multiplexed Blood-Borne Pathogen Identification and Detection. National Institute of Allergy 

and Infectious Diseases, Washington, DC, August 17, 2006. 

156. Molecular profiling of colon tumors. Invited Speaker at Chips-to-Hits IBC meeting, Boston,

MA. September 27, 2006. 

157. Molecular profiling of colon tumors. Grand Rounds, Dept. of Pathology, Weill Medical 

College of Cornell University, New York, NY October 9, 2006. 

158. Molecular profiling of colon tumors. Grand Rounds, Dept. of Pathology, College of 

Physicians & Surgeons of Columbia University. New York, NY, October 17, 2006. 

159. Molecular profiling of colon tumors. Invited Speaker: Colon Cancer Initiative Meeting. The 

Ludwig Institute for Cancer Research - Hospital A. Oswaldo Cruz, Sao Paulo, Brazil,
February 21, 2007. 

160. Molecular profiling of colon tumors. Cepheid, Sunnyvale, CA, June 25, 2007. 

161. Molecular profiling of cancers. Ensemble, Boston, Sept. 17, 2007. 

162. Molecular profiling of colon tumors. Invited Speaker at AACR Special Conference: 

Advances in Colon Cancer Research meeting. Boston, MA. November 17, 2007. 
Abstract

Background: The high binding specificity of short 10 to 30 mer oligonucleotide probes enables single
base mismatch (MM) discrimination and thus provides the basis for genotyping and resequencing
microarray applications. Recent experiments indicate that the underlying principles governing DNA
microarray hybridization - and in particular MM discrimination - are not completely understood.
Microarrays usually address complex mixtures of DNA targets. In order to reduce the level of complexity
and to study the problem of surface-based hybridization with point defects in more detail, we performed
array based hybridization experiments in well controlled and simple situations.

Results: We performed microarray hybridization experiments with short (16 to … mer) target and probe
lengths (in situations without competitive hybridization) in order to systematically investigate the impact
of point-mutations - varying defect type and position - on the oligonucleotide duplex binding affinity. The
influence of single base bulges and single base MMs depends predominantly on position - it is largest in the
middle of the strand. The position-dependent influence of base bulges is very similar to that of single base
MMs; however, certain bulges give rise to an unexpectedly high binding affinity. Besides the defect (MM or
bulge) type, which is the second contribution in importance to hybridization affinity, there is also a
sequence dependence, which extends beyond the defect next-neighbor and which is difficult to quantify.
Direct comparison between binding affinities of DNA/DNA and RNA/DNA duplexes shows that RNA/DNA
purine-purine MMs are more discriminating than corresponding DNA/DNA MMs. In DNA/DNA
MM discrimination the affected base pair (C·G vs. A·T) is the pertinent parameter. We attribute these
differences to the different structures of the duplexes (A vs. B form).

Conclusion: We have shown that DNA microarrays can resolve even subtle changes in hybridization
affinity for simple target mixtures. We have further shown that the impact of point defects on
oligonucleotide stability can be broken down to a hierarchy of effects. In order to explain our observations
we propose DNA molecular dynamics - in form of zipping of the oligonucleotide duplex - to play an
important role.
Background

DNA microarray technology relies on the highly specific binding affinity of surface-tethered DNA
probe sequences to complementary target sequences. Nucleic acid hybridization, the sequential base
pairing between complementary probe and target strands, results in the formation of stable
double-stranded duplexes. In microarray hybridization assays single-stranded nucleic acid targets -
contained in a complex mixture of different target sequences in solution - freely diffuse over the
surface-tethered probes until they are captured by a complementary probe. Target strands often carry
fluorescent dye labels to enable quantitative detection of the individual target species. Hybridized
targets can be identified by the position of the corresponding microarray features (each containing
one particular species of surface-tethered probe strands) within the regular grid of the DNA
microarray.

In DNA microarray applications, along with a high binding affinity (providing sensitivity), a high
specificity of probe-target hybridization is required to discriminate between sometimes very similar
homologous sequences. Binding specificity is particularly important in genotyping applications where
Single Nucleotide Polymorphisms (SNPs), genetic variations of single bases, are concerned. SNPs
determine genetic individuality, but also predisposition to a variety of genetic diseases, response
to drugs, pathogens, chemicals and other agents. SNPs are of great interest not only for genetic
research but also for medical diagnostics and therapy [1,2].

SNPs and point-mutations can be detected by means of relatively short 10 to 30 mer probes. Already a
single mismatched (MM) base pair can result in a significant decrease of the duplex binding affinity
with respect to the corresponding perfect matching (PM) duplex [3].

The binding affinity of mismatched duplexes - in bulk solution - is commonly predicted on the basis
of the nearest-neighbor model [4-6]. A recent study by Pozhitkov et al. [7] revealed a poor
correlation between predicted duplex binding affinities and actual hybridization signal intensities,
implying that the thermodynamic properties of oligonucleotide hybridization on DNA microarrays are
far from understood. In DNA microarray experiments the binding affinity of mismatched oligonucleotide
duplexes is governed not just by nearest-neighbor parameters - as in solution-phase hybridization -
but mainly by the position of the defect [7-10]. Furthermore, the secondary structure of the long
target strands [11] and various surface effects [12] have a significant influence on the microarray
binding affinity.

Our study is a comprehensive approach to understand how point defects affect the hybridization of
fluorescently labeled oligonucleotide targets to surface-bound oligonucleotide probes. In contrast to
previous work on single base MMs, which was conducted with complex target mixtures either from PCR
products [9] or in vitro transcripts [7], we employ short (20-37 nt), end-labeled oligonucleotide
targets, thus avoiding labeling and steric hindrance related effects. In order to avoid competitive
binding [13] we perform each hybridization assay with a single target sequence. Oligonucleotide
target sequences (DNA and RNA - see Tab. 1) were chosen to minimize secondary structures and any
related influence on the hybridization signal. In particular we investigated differences between the
impact of defects on DNA/DNA and analogous RNA/DNA duplexes.



Table 1: Fluorescently labeled target oligonucleotides used in this study.

DNA target sequences (5'→3'):
ACTACAAACTTAGAGTCCAC
CAGAGGGGACTGGAATTC
ACTCCCAACCACCACCCTATCA
GTGATGCTTGTATGGAGCAA
...TACTGCGATT
ACATCAGTGCCTGTGTACTACCAC
ACCGAACTCAAAGCAAAGAC
AACTCGCTATAATGACCTGGACTG
TAGTGGGAGTTCTTACTGATGTGA

RNA target sequences (5'→3'):
ACAUCAGUGCCUGUGUACUACGAC
GUCAUGCUUGUAUGGAGCAA
UACUCCGAUUCGAU
AACUCGCUAUAAUGACCUGGACUG

All targets are end-labeled with Cy3 (5'-Cy3 or 3'-Cy3).
Fluorescently labeled DNA and RNA target oligonucleotides.

DNA chips were fabricated by light-directed in situ synthesis [14,15] with a digital micromirror
device (DMD™, Texas Instruments) based maskless synthesis apparatus [16-21] developed in our
laboratory [10]. Sets of probe sequences were derived from probe sequence motifs by systematic
variation of defect type and defect position, including all single base mismatches, insertions and
deletions. The design of the hybridization experiments (Fig. 1) enables discrimination between the
strong influence of defect position [7,9,10] and the more subtle defect-type and sequence related
factors.
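
The systematic variation described above can be illustrated with a short sketch. It is not the authors' probe design software; the function names, the 0-based position convention and the 16 mer motif are assumptions chosen only for illustration. Insertions are placed between two neighboring motif bases, so some inserted probes coincide when the inserted base equals a neighbor.

```python
BASES = "ACGT"

def single_mismatches(motif):
    """Every probe that differs from the motif by one substituted base."""
    return [
        (pos, "MM", motif[:pos] + base + motif[pos + 1:])
        for pos in range(len(motif))
        for base in BASES
        if base != motif[pos]
    ]

def single_insertions(motif):
    """Every probe with one extra base inserted between two motif bases."""
    return [
        (pos, "INS", motif[:pos] + base + motif[pos:])
        for pos in range(1, len(motif))
        for base in BASES
    ]

def single_deletions(motif):
    """Every probe with exactly one motif base removed."""
    return [
        (pos, "DEL", motif[:pos] + motif[pos + 1:])
        for pos in range(len(motif))
    ]

def probe_set(motif):
    """Perfect match probe followed by all single-defect variants."""
    return (
        [(None, "PM", motif)]
        + single_mismatches(motif)
        + single_insertions(motif)
        + single_deletions(motif)
    )

# Hypothetical 16 mer probe motif, used only to show the size of a probe set.
MOTIF = "ACTACAAACTTAGAGT"
probes = probe_set(MOTIF)
print(len(probes))
```

For a motif of length n this yields 3n mismatch probes, 4(n - 1) insertion probes and n deletion probes in addition to the perfect match.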

After the current article was submitted, we became aware of further related studies in this area.
Suzuki et al. [22] performed hybridization on custom NimbleExpress™ arrays (Affymetrix Inc.) to
investigate the influence of the probe length and mismatch position on single base MM
discrimination. Fish et al. [23] performed a direct comparison between hybridization signals
(perfectly matching and mismatched duplexes) from spotted microarrays and measured thermodynamic
melting parameters (determined by differential scanning calorimetry in bulk solution). They report a
linear relation between the duplex free energy and the microarray hybridization intensity.

The focus of the present paper is on the impact of various defect types (single base mismatches and
single base bulges) on the hybridization signal.

Results

The microarray hybridization experiments performed in this study provide quantitative information on
the binding affinity of individual mismatched duplexes by means of the hybridization signal intensity
(fluorescence of hybridized targets). Since the absolute hybridization signal intensities of the
different sequence motifs employed in this study (Tab. 1) are subject to a large variation (often
larger than between mismatched and corresponding PM hybridization signals), we compare the MM
hybridization signals with the corresponding PM hybridization signals. Our experiments - experimental
details (probe sets, hybridization signal normalization etc.) are explained in the Methods section -
provide a measure for the mismatch discrimination with respect to the corresponding PM binding
affinity, rather than an absolute measure for the MM binding affinity. The discrimination between the
hybridization affinity of point-mutated probes and corresponding perfect matching probes depends on
the stability of the particular probe sequence. In agreement with [22] we observed that the (more
stable) 25 mer probes are less discriminative with respect to point defects than the shorter 16 mer
probes. Discrimination is also reduced for sequence motifs stabilized by a higher CG-content.
MM defect position ond hybriaization affinity 

The "defect profile" plots (plots of the normalized
hybridization signal vs. defect position - e.g. in Fig. 2) show
that the dominant parameter determining oligonucleotide
probe-target-affinity - on the microarray surface - is the
position of the defect. A moving average evidences a
trough-like "mean profile" curve (solid black line in Fig. 2).
A parabolic fit can provide a reasonable approximation for the
average position dependence obtained from a large number of
different sequence motifs [7,9]. For 16 mer duplexes a single
base mismatch in the center typically results in 40% of the PM
hybridization signal. However, for individual sequence motifs
we found sequence-dependent deviations from the simple position
dependence (see Fig. 3). The raw signal intensities and
probe/target sequences of the experiment are given in
Additional file 1.
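The moving-average "mean profile" described above can be sketched as follows. This is only an illustration: the window of p - 2 to p + 2 is taken from the figure captions, while the function name, the truncation of the window at the probe ends and the numerical values are assumptions made for the example.

```python
def mean_profile(signals_by_position, half_window=2):
    """Moving average of all defect signals over positions p-2 .. p+2.

    signals_by_position: list indexed by defect position; each entry is a
    list of hybridization signals measured at that position (e.g. the three
    mismatch probes). The window is truncated at the probe ends (assumption).
    """
    n = len(signals_by_position)
    profile = []
    for p in range(n):
        lo = max(0, p - half_window)
        hi = min(n, p + half_window + 1)
        window = [s for q in range(lo, hi) for s in signals_by_position[q]]
        profile.append(sum(window) / len(window))
    return profile


# Hypothetical PM-normalized signals for a short 7-position probe,
# three mismatch probes per position (illustrative values only).
example = [
    [0.9, 0.8, 1.0],
    [0.7, 0.6, 0.8],
    [0.5, 0.4, 0.6],
    [0.4, 0.3, 0.5],
    [0.5, 0.4, 0.6],
    [0.7, 0.6, 0.8],
    [0.9, 0.8, 1.0],
]
profile = mean_profile(example)
```

For these illustrative values the averaged curve is lowest at the central position and rises symmetrically towards both ends, which is the trough shape referred to in the text.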

Influence of the mismatch type in DNA/DNA duplexes

In the following we use the notation of the mismatch base pair
X·Y consisting of the mismatched base X in the probe sequence
and the base Y in the target sequence. To investigate how the
particular MM-types X·Y affect duplex stability we measured
probe-target-affinities for 25 different sequence motifs.
Microarray hybridization experiments with single base mismatch
probe sets as well as the extraction of their hybridization
signals, which reflect duplex stability, are described in more
detail in the Methods section. Owing to the limited number of
available target oligonucleotides we restricted base
substitutions to the probe sequences. The PM hybridization
signals of the different 16 mer sequence motifs display a
strong variation (up to a factor 20). Since the relative
hybridization signal intensities within the individual probe
sets are largely unaffected by this variation, we normalize the
"defect profiles" by division with their standard deviation.
The resulting database, comprising normalized hybridization
signals from about 1000 different single MM probe sequences,
enables categorization of the binding affinities according to
the mismatch type.
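As an illustration of the X·Y notation and of a single base mismatch probe set, the following sketch enumerates all substitutions of a probe motif and labels each with its MM-type. The function name and the 1-based position convention are assumptions made for the example; the motif is the one shown in Fig. 1.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatch_probe_set(motif):
    """Return (position, probe_sequence, mm_type) for every single base substitution.

    The target base Y opposite a probe position is the complement of the
    perfect-match probe base; X is the substituted probe base. Positions are
    1-based (assumption for this sketch).
    """
    probes = []
    for i, pm_base in enumerate(motif):
        target_base = COMPLEMENT[pm_base]
        for x in "ACGT":
            if x == pm_base:
                continue  # skip the perfect-match base
            probe = motif[:i] + x + motif[i + 1:]
            probes.append((i + 1, probe, f"{x}·{target_base}"))
    return probes


motif = "TATTACTGGACCTGAC"  # probe motif of Fig. 1, written in the figure's orientation
probes = mismatch_probe_set(motif)
```

A 16 mer motif yields 48 mismatch probes, three per defect position, covering the twelve possible MM-types when all four bases occur in the motif.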

For statistical analysis of MM type and nearest-neighbor
influences the superposed positional influence needs to be
eliminated by subtraction of the mean profile. The resulting
position-independent defect profile (for simplicity we keep the
term "defect profile"), consisting of influences of defect type
and defect neighborhood only, is shown in Fig. 2B. The boxplot
representation of this data in Fig. 4 demonstrates that
MM-types affecting C·G base pairs (i.e. A·C, C·C, T·C and A·G,
G·G, T·G) systematically have lower median hybridization signal
values than MM-types affecting A·T base pairs (A·A, C·A, G·A
and C·T, G·T, T·T).
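The processing chain described above (subtraction of the mean profile, normalization by the standard deviation of the defect profile, grouping by the affected base pair) might be sketched as follows. The function names, the use of the population standard deviation and the toy data are assumptions; the text does not specify these details.

```python
from statistics import mean, median, pstdev


def deviation_profile(signals, half_window=2):
    """signals: list of (position, mm_type, signal) with 0-based positions.

    Subtracts the moving-average mean profile and divides by the standard
    deviation of the whole defect profile (population form, an assumption).
    """
    sd = pstdev([s for _, _, s in signals])
    n_pos = max(p for p, _, _ in signals) + 1
    mean_prof = []
    for p in range(n_pos):
        window = [s for q, _, s in signals if abs(q - p) <= half_window]
        mean_prof.append(mean(window))
    return [(p, t, (s - mean_prof[p]) / sd) for p, t, s in signals]


def median_by_site(deviations):
    """Median deviation for MM-types X·Y affecting C·G versus A·T base pairs."""
    groups = {"C·G": [], "A·T": []}
    for _, mm_type, dev in deviations:
        site = "C·G" if mm_type[-1] in "CG" else "A·T"  # Y is the target base
        groups[site].append(dev)
    return {site: median(vals) for site, vals in groups.items() if vals}


# Toy data (illustrative only): one mismatch at a C·G site and one at an
# A·T site per position, superposed on a trough-shaped position dependence.
base = [0.9, 0.6, 0.4, 0.6, 0.9]
signals = []
for p, b in enumerate(base):
    signals.append((p, "A·C", b - 0.1))
    signals.append((p, "A·A", b + 0.1))
medians = median_by_site(deviation_profile(signals))
```

In this toy data set the positional trough is removed by the subtraction, and the remaining median deviation is lower for the mismatch type affecting a C·G base pair, mirroring the qualitative statement in the text.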

We compared the MM-type related hybridization signal
deviations from the mean MM profiles (Fig. 2B) to predicted
Gibbs free energy differences
$\Delta\Delta G^{\circ}_{37} = \Delta G^{\circ}_{37,\mathrm{MM}} - \Delta G^{\circ}_{37,\mathrm{PM}}$
between MM and corresponding PM duplexes.
$\Delta\Delta G^{\circ}_{37}$ were determined from

Figure 1

[Schematic: a set of point-mutated probe sequences (single base
mismatch, single base insertion and single base deletion probes
for each defect position) is derived from a common probe
sequence motif, e.g. 3'-TATTACTGGACCTGAC-5', which is
complementary to a section of the target. Further panels:
feature arrangement on the microarray; hybridization signals
plotted versus defect position. Caption: [...]]
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Figure 2 

Single base mismatch defect profile (hybridization signal vs.
defect base position) obtained from the hybridization signals
[...] of Additional file 9. Solution-background correction (see
Methods section) was applied on raw hybridization signal
intensities. The probe sequence motif 3'-TATTACTGGACCTGAC-5' is
complementary to the target oligonucleotide COM. Markers depict
the substituent base type (A red crosses; C green circles; G
blue stars; T cyan triangles). (A) The mean profile (black
line) is the moving average of all mismatch hybridization
signals over positions p - 2 to p + 2. PM probes, included as
control to detect erroneous bias, have the largest
hybridization signal [...] error of the measurement [...].
Errors between distant microarray features, due to gradient
effects, are expected to be larger than errors between the
compactly arranged features corresponding to a particular
defect position. (B) Deviation profile. The strong position
dependent component of the hybridization signal is eliminated
by subtraction of the mean profile. (C) Comparison of mean
mismatch hybridization signals (average of the three mismatch
hybridization signals at a particular defect position) at the
sites of C·G base pairs to mean MM hybridization signals at the
sites of adjacent A·T base pairs. A marker (red star: A·T; blue
circle: C·G) is set in the upper row if the hybridization
signal of the mismatches at the corresponding site is higher
than at the adjacent site, otherwise in the lower row.
Mismatches substituting a C·G base pair usually have
systematically lower hybridization signals than mismatches
substituting a neighboring A·T base pair.
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Figure 3 

The impact of defects is affected by the local sequence
environment. Single base insertion profiles (hybridization
signal plotted versus the insertion base position) of four
25 mer probe sequence motifs complementary to the same target
URA. Following solution-background correction of the raw
intensity data (Methods section) hybridization signals were
normalized with respect to the largest hybridization signal in
each of the four insertion profiles. The probe motifs 1 to 4
hybridize at different sections of the target oligonucleotide.
Mean profiles (thick lines) were obtained from the moving
average of the particular insertion profiles (particular
hybridization signals are shown as faint symbols - profile 4 is
shown in detail in Fig. 5A). The mean profiles 1 to 3 have a
distinct minimum between base positions 15 to 20. The
stabilizing CG-rich region following after base position 20
results in increased hybridization signals in profile 4.



mismatch nearest-neighbor thermodynamic parameters [4]. Our
analysis (shown in Additional file 2) indicates a decreasing
trend of the experimental hybridization signal values with
increasing $\Delta\Delta G^{\circ}_{37}$. Moreover, we observed
that single base mismatches with two A·T flanking base pairs
tend to provide a better mismatch discrimination than
mismatches flanked by two C·G base pairs.

DNA/DNA single base bulge defects

Single base insertions and deletions, owing to a surplus
unpaired base in one of the two strands, result in bulged
duplexes. In our experiments (sequence data and hybridization
signal raw data are provided in Additional file 3) the bulged
base of a single base insertion is located on the surface-bound
probe strand, whereas in duplexes with single base deletions
(on the probe sequence) the bulge is on the target strand.

We discovered that on average the positional dependence of the
insertion and deletion defect profiles (e.g. in Figs. 3 and 5A)
is very similar to the positional dependence of mismatch defect
profiles (Fig. 2). Within one and the same individual defect
profile, single base bulge defects originating from single base
insertions or deletions display the same positional dependence
as single base mismatch defects (direct comparison shown in
Fig. 6 - hybridization signal data provided in Additional
file 4), qualitatively as well as quantitatively. On average
single base insertion probes provide increased hybridization
signals when compared to MM probes or single base deletions
(Fig. 7). Besides the significantly increased hybridization
signals of Group II insertions (see below), this is due to the
reduced number of binding base pairs in the mismatched duplexes
(which have one binding base pair less than the PM duplex,
whereas a single base insertion leaves the number of binding
base pairs unchanged). In single base insertions no binding
base pair is substituted, but we see that the influence of the
inserted base clearly depends on its neighbor. The individual
curves (e.g. the curve of C insertions - green circles in
Fig. 5) show deviations from the (moving average) mean profile:
hybridization signals can be significantly increased over
several consecutive defect positions. In particular base
insertions next to identical bases (so called Group II bulges
[24]) result in systematically increased binding affinities -
in comparison to insertions of non-identical bases (Group I
bulges). Group II bulges located near the center of 16 mer
probes often show hybridization signals with a similar
intensity as the corresponding PM probe (Fig. 6, Fig. 5C). A
statistical analysis with a large dataset (Fig. 8) comprising
hybridization signal data from 1000 different 20-25 mer probes
indicates the general validity of the result.

Figure 4

Boxplot representation of the hybridization signal
distributions for the individual mismatch types, arranged
according to the median values (the 95% confidence bounds are
depicted by the notch). Boxes indicate the interquartile range
(from the 25th to 75th percentile) containing 50% of the data.
Whiskers extend to a maximum value of 1.5 times the
interquartile range from the boxes' ends. Medians whose notches
do not overlap differ significantly with a 95 percent
confidence. Data processing: raw intensity data,
solution-background correction, subtraction of the mean
profile, normalization of the defect-type dependent deviations
from the mean profile by division by the standard deviation of
the defect profile (see Methods section). The mismatch types
with the lowest hybridization signals are those (T·G, C·C, T·C,
A·C, G·G) where C·G base pairs are affected by the mismatch
defect. The only exception is A·G. The positive tails of this
and other distributions seem to originate from stabilizing C·G
base pairs next to the defect. [...] (standard deviation
assuming that the various MM nearest-neighbor types are equally
distributed)

Figure 5

(A) Single base insertion defect profile (hybridization signal
plotted against the insertion base position; following
solution-background correction of the raw intensity data,
hybridization signals were normalized with respect to the
largest hybridization signal in the insertion profile) of the
probe sequence motif 3'-CACGTCCTCTCCCCTCACCTTAAG-5'
(complementary to the target URA). Symbols correspond to
insertion bases (A red crosses; C green circles; G blue stars;
T cyan triangles). The mean profile (black line), obtained from
the moving average (including all 4 insertion types) over
positions p - 2 to p + 2, shows the common positional
dependence. Insertions to the left and to the right of an
identical base (Group II bulges - see text) result in identical
probe sequences. (B) and (C) Deviation profiles. Positional
influence is mostly eliminated by subtraction of the mean
profile. Elevated intensities are observed for Group II bulges
(e.g. C insertions at positions 11 to 15, 6 to 7 and 18 to 20
or G insertions at positions 4 to 5 and 7 to 8). A very
distinct increase of the hybridization signal is observed for C
insertions into the subsequence TCCCCT in the middle of the
sequence. As shown in (C) Group II bulges (red markers) have
significantly higher intensities compared to Group I bulges
(blue markers).

Figure 6

Direct comparison of single base mismatches, insertions and
deletions. The 16 mer probe sequence motif
3'-TTGACTTTCGTTTCTG-5' is complementary to the target BEI.
Hybridization signals (data processing: raw fluorescence
intensities; solution-background correction) of single base
mismatch probes with substituent bases A (red crosses), C
(green circles), G (blue stars), T (cyan triangles); running
average of mismatch intensities (black line); perfect match
probe signals (grey symbols); single base insertion probes
(solid lines) with insertion bases A (red), C (green), G
(blue), T (cyan). Hybridization signals of single base
deletions (orange dashed line) are comparable to that of
mismatches at the same position. Increased hybridization
signals of certain insertion defects are due to positional
degeneracy of base bulges (see discussion).
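The distinction between Group I and Group II insertions, and the positional degeneracy of the latter, can be illustrated with a short sketch. The function name and the gap-index convention are assumptions made for this example; the motif is the one shown in Fig. 1.

```python
def classify_insertion(motif, gap, base):
    """Insert `base` between motif[gap-1] and motif[gap] (1 <= gap <= len(motif)-1).

    Returns (probe_sequence, group): 'II' if the inserted base is identical
    to one of its two neighbors, otherwise 'I'.
    """
    left, right = motif[gap - 1], motif[gap]
    probe = motif[:gap] + base + motif[gap:]
    group = "II" if base in (left, right) else "I"
    return probe, group


motif = "TATTACTGGACCTGAC"  # probe motif of Fig. 1

# A 'T' inserted before, between or after the two adjacent T's gives the
# same probe sequence: the bulge position is degenerate (Group II).
group_ii = [classify_insertion(motif, gap, "T") for gap in (2, 3, 4)]

# An 'A' inserted between the two T's has no identical neighbor (Group I).
group_i = classify_insertion(motif, 3, "A")
```

Because several insertion positions map to one and the same probe sequence, a Group II probe cannot be assigned a unique bulge position, which is the degeneracy invoked in the text.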

Interestingly, systematically increased hybridization signals
(with respect to the averaged hybridization signal level from
other defect types at the same position) have also been
observed for certain Group I bulges: for C-insertions next to a
T (e.g. in Fig. 5 at base position 15) we frequently find
increased binding affinities similar to that of Group II
bulges.

We further analyzed the degree of correlation between the
binding affinities of probes with different insertion bases X
and Y (see Additional file 5). A clear correlation appears
between the hybridization signals of probes with T- and
C-insertions, and also, though less distinct, between A- and
C-insertions. In contrast, we observed an anti-correlation
between G- and A-insertions.
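A correlation analysis of this kind could, for instance, use the Pearson coefficient between the deviation profiles of two insertion bases. The text does not state which correlation measure was used, so the choice of Pearson's r, the function name and the profile values below are assumptions for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long signal profiles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical deviation profiles (signal minus mean profile) of two
# insertion bases along the same probe motif; values are illustrative.
dev_first = [0.10, -0.05, 0.20, 0.00, -0.15]
dev_second = [0.08, -0.02, 0.15, 0.01, -0.10]
r = pearson(dev_first, dev_second)
```

A value of r close to +1 corresponds to the "clear correlation" case, a value below zero to an anti-correlation.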

DNA/DNA versus DNA/RNA mismatch and bulged hybridization

To investigate if the above results from DNA/DNA hybridization
also apply to hybridization of RNA/DNA duplexes we performed a
direct comparison employing DNA targets and corresponding RNA
targets on the same microarray. We observed that MM
discrimination in RNA/DNA duplexes is similar to MM
discrimination in DNA/DNA duplexes (see Additional file 6). A
statistical analysis (see Figs. 9 and 10) reveals, however,
that purine-purine MMs are (with respect to the ranking order
of MM stabilities, Fig. 10b, c and Fig. 10d) somewhat less
stable in RNA/DNA duplexes than in DNA/DNA duplexes. The most
significant differences between RNA/DNA and DNA/DNA MMs (see
Additional file 7) are observed for the MM-types G·A and A·G
(more stable in DNA/DNA duplexes) and for the MM-type T·G
(which is more stable in RNA/DNA duplexes). A presumed
destabilizing effect of purine-purine MMs in the ranking order
of RNA/DNA mismatch discrimination (Fig. 10c) is superposed to
the affected base pair effect (C·G vs. A·T - see above), which
is, very similarly, also observed in DNA/DNA hybridization.

Figure 7

Comparison of the hybridization signals of different point
mutation types. To minimize positional influence the statistics
include only defect positions 5 to 11, located in the center of
the 16 mer probes. The 1200 probe sequences were derived from
17 probe sequence motifs. Data processing: raw fluorescence
intensity data; solution-background correction; hybridization
signals are normalized by division by the corresponding perfect
match hybridization signals. Defect categories: mismatch M-X
(X: substituent base); mismatches at A·T and C·G sites M@AT,
M@CG; single base deletion D; deletions at A·T and C·G sites
D@AT, D@CG; single base insertion I-X(I/II) (X: insertion base;
I/II: Group I/Group II base bulge). Hybridization signals from
insertion probes (about 50% of the PM hybridization signal for
Group I, 65% for Group II - median values) are significantly
[...]. Mismatches at A·T sites result in about 25% larger
hybridization signals than MMs at C·G sites. Deletion probes
have a median hybridization signal that is slightly lower than
the median MM hybridization signal. Group I base bulges, with
the exception of I-A(I) (33%), have hybridization signals of
about 50% of the PM hybridization signal. Hybridization signals
of Group II base bulges are (with the exception of
T-insertions) significantly higher than that of the
corresponding Group I bulges.

Figure 8

Boxplots show the hybridization signal deviations (from the
mean profile) for the different insertion base types (A(I),
C(I), G(I), T(I), A(II), C(II), G(II), T(II)), which are
differentiated according to affiliation to bulge Group I/II.
Data processing: raw intensity data; solution-background
correction; subtraction of the mean profile yields the
defect-type dependent contribution of the hybridization signal.
The statistical analysis includes about 1000 hybridization
signals from 12 different 20 to 25 mer probe sequence motifs.

For bulged duplexes we did not observe significant defect-type
specific differences between RNA/DNA and DNA/DNA hybridization.
The hybridization signal and sequence data from the microarray
hybridization experiment are provided in Additional file 8.



Single base insertion, deletion and mismatch defects in comparison

Defect profiles for MMs and base bulges (Fig. 6) exhibit a very
similar quantitative influence from defect position in DNA/DNA
as well as in DNA/RNA complexes. For individual sequences the
mean trough-shaped profile can be altered: Fig. 3 shows
deformations of the trough-like profile on scales much larger
than the size of a base pair.

Single base MM discrimination also depends on the type of MM
base pair and the corresponding PM base pair (which has been
substituted by the MM). Hybridization signals of MMs
(normalized with respect to the PM hybridization signal)
originating from C·G base pairs are about 25% smaller (in the
median) than for MMs from
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Figure 9 

Comparison of DNA/DNA and RNA/DNA mismatch hybridization
signals - statistical analysis. (A) MM-type related influence
in DNA/DNA oligonucleotide duplexes. The positional influence
was eliminated by subtraction of the moving average MM profile.
Subsequent normalization was performed by division through the
mean hybridization signal of the particular MM profile. (B)
MM-type related influence in RNA/DNA oligonucleotide duplexes.
Hybridization signal differences between the pairs of RNA/DNA-
and analog DNA/DNA-duplexes are shown in Additional file 7.

a) DNA/DNA hybridization (large data set)

G·A > T·T > G·T > A·G > C·T ≈ C·A > A·A ≈ T·G ≈ C·C ≈ T·C > A·C > G·G

b) DNA/DNA hybridization (small data set for direct
comparison with RNA/DNA hybridization)

G·A > T·T > A·G ≈ C·T ≈ G·T ≈ A·A > C·A > A·C > T·G > T·C ≈ C·C > G·G

c) RNA/DNA hybridization (small data set - equivalent
to the DNA/DNA data set in b)

T·U > G·A > T·G ≈ G·U ≈ C·U ≈ C·A ≈ C·C ≈ A·A > A·C > A·G > T·C > G·G

d) Difference between RNA/DNA and DNA/DNA
hybridization signals. Uracil is considered like thymine.
(T·G to G·T: $I_{\mathrm{RNA/DNA}} > I_{\mathrm{DNA/DNA}}$; A·C to G·A: $I_{\mathrm{RNA/DNA}} < I_{\mathrm{DNA/DNA}}$)

T·G > C·A > C·C > T·T > C·T ≈ G·T > A·C ≈ T·C > A·A ≈ G·G > A·G > G·A

Figure 10

Comparison between DNA/DNA and RNA/DNA mismatch binding
affinities. (a) Ranking order of DNA/DNA mismatch binding
affinities (extracted from Fig. 4). (b) As anticipated the
ranking order for DNA/DNA MMs obtained from the smaller subset
of probe sequences (Fig. 9A) is very similar. The ranking order
for the analogue RNA/DNA MM duplex stabilities (c) (extracted
from Fig. 9B) reveals significant differences in comparison to
(b). In part (d) MM-types are ordered according to the
hybridization signal differences between RNA/DNA and DNA/DNA
MMs (as extracted from Additional file 7). Purine-purine MMs
(purine bases highlighted in blue) display the largest decrease
of binding affinities with respect to other MM-types.



A·T base pairs. Single base deletions affecting C·G base pairs
result in about 30% smaller hybridization signals than
deletions affecting A·T base pairs. The deletion profile in
Fig. 6 (orange dashed line) shows that the local ups and downs
of the profile curve correlate with deletions affecting either
A·T or C·G base pairs. Thus, for MMs and single base deletions
it is the type of base pair affected by the point-mutation
which determines the impact on hybridization affinity to an
important degree; however, it is still less important than
defect position.

We also observe a noticeable influence of the next-neighbor
bases of the mismatch (see Additional file 2).

Discussion 

Dominating influence of defect position

We observe that defects located in the center of the
oligonucleotide duplexes are significantly more destabilizing
than defects at the ends [10]. Similar influence of the MM
position has been reported previously from other microarray
based studies [7,9], and also - although sparsely - from
hybridization experiments in solution [25,26]. The limited data
in solution may be due to the technical difficulty of studying
a large number of different probes. Quantitatively, in
accordance with [7] we have identified MM position (relative to
the duplex ends) as the strongest influential factor on the
hybridization signal, when compared to MM-type and nearest
neighbor effects.

The well-established two-state nearest-neighbor model, which
has proved to be reliable for the prediction of duplex
stabilities in solution-phase, does not regard the position of
the (mismatched) NN pairs [6]. We propose that a model for the
prediction of microarray binding affinities should also include
the position of the NN pairs - in particular in case of
mismatched NN pairs. Affinity models for microarray
hybridization considering a positional dependence of the
nearest-neighbor parameters have been previously discussed in
[12,27-30].

We observe a very similar position dependence for single base
bulge defects as for single base mismatches. Also, the
magnitudes of the impacts of the MMs and base bulges on the
hybridization signal are very similar (apart from the
relatively high binding affinity of Group II bulges). This
consistency suggests a common origin of the positional
influence, independent of defect type.

Sterical crowding at the surface, as suggested by Peterson
et al. [31], can in principle reduce the accessibility of the
probe surface-bound 3'-ends and can thus decrease the impact of
defects located near this end. However, in our case we observe
largely symmetrical intensity profiles with respect to both
ends of the probes (Fig. 2).

Focusing on individual probe sequence motifs we observe that
the positional influence does not only depend on the
defect-to-end distance, but also has a sequence-dependent
contribution. This indicates that the impact of a defect also
depends on the stability of the local sequence environment
(beyond the nearest neighbors). Since there are no long range
molecular forces, we infer that the molecular dynamics must
play a role; effects like breathing bubbles or zipping could be
at the origin. This influence of the duplex sequence and the
observed symmetry of the defect positional influence with
respect to the duplex ends suggest that end-domain opening
(i.e. sequential unzipping of the double-helix from the duplex
ends) must be suspected to be a key mechanism for understanding
the influence of defect position on duplex stability.

Influence of the MM-type

Removing the positional influence in our data, we see that
single-base MMs introduced at the site of a C·G base pair
result in a larger decrease of the hybridization signal (with
respect to the PM hybridization signal) than MM defects
affecting A·T base pairs. The same applies for single base
deletions (see Fig. 6). These experimental results (Fig. 4), in
accordance with nearest-neighbor thermodynamic parameters for
Watson-Crick base pairs [6], mainly reflect the increased base
stacking and hydrogen bonding interactions of C·G base pairs.
We observe a positive correlation between the experimentally
determined single base mismatch discrimination and predicted
free energy increments $\Delta\Delta G^{\circ}_{37}$ (between
MM and PM duplexes) on the basis of the nearest-neighbor
model - for details see Additional file 2. A similar
correlation (between log2(PM/MM) hybridization signal values
and $\Delta\Delta G^{\circ}_{37}$) has been reported previously
in [9].
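The log2(PM/MM) discrimination measure mentioned above can be computed as in the following minimal sketch. The function name is an assumption; the example uses the figure quoted earlier in the text, a central mismatch in a 16 mer giving about 40% of the PM signal.

```python
from math import log2


def mm_discrimination(pm_signal, mm_signal):
    """Mismatch discrimination as log2 of the PM/MM hybridization signal ratio."""
    return log2(pm_signal / mm_signal)


# A central mismatch in a 16 mer typically yields about 40% of the PM signal.
central_16mer = mm_discrimination(1.0, 0.4)  # log2(2.5), about 1.32
```

Larger values correspond to stronger discrimination; a value of zero would mean that the mismatched probe binds as well as the perfect match.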

We emphasize the good correlation between our DNA/DNA MM
stability order (Fig. 12e) and the corresponding results of
Wick et al. [9] (the MM stability order in Fig. 12d was
extracted from the plot of log2(PM/MM) hybridization signal
values in Fig. 5a in [9]). A major difference, however, occurs
for the MM-pair G·G, which is the least stable in our study.
Wick (and also Sugimoto [32]) found G·G to be one of the most
stable MMs. Interestingly, however, Pozhitkov et al. [7] - in
accordance with our results - identified G·G as one of the
least stable MM-types.

Our direct comparison between DNA/DNA and RNA/DNA
hybridization on microarrays reveals - for RNA/DNA duplexes -
an increased destabilization of purine-purine mismatches, with
respect to other MM types. A possible explanation for the
observed differences between DNA/DNA and RNA/DNA binding
affinities is that purine-purine MMs cause larger steric
hindrance in the A-form hybrid duplexes than in the B-form
DNA/DNA duplexes.

In contrast to [7] we did not observe that purine-purine
mismatches in RNA/DNA duplexes are, in absolute terms, more
discriminative than other MM-types.

Increased stability of Group II single base bulges

We observe significantly increased hybridization signals of
single-base insertion defects in which the insertion base is
placed next to a like-base. Our investigation shows that (on
the microarray) the difference between Group I and Group II
binding affinities $\delta I_{\mathrm{I/II}}$ (inferred from
the hybridization signal I) is distinctly larger than the
defect-type related variation of binding affinities
$\delta I_{\mathrm{MM}}$ (see Fig. 7). For comparison, the free
energy differences among the MM trinucleotide duplexes
$abc/\bar{a}x\bar{c}$ and $abc/\bar{a}y\bar{c}$ (mismatched
bases x and y; neighboring bases a and c unchanged; overline
denotes complementary bases) span the range
$\delta\Delta G^{\circ}_{37}$ = 0.5 to 2.6 kcal/mol (calculated
with MM nearest-neighbor free energies [33] for T = 37°C).

The increased stability of Group II bulges in comparison with
Group I bulges has been investigated previously in solution
rather than on microarrays [24,34,35]. According to Ke and
Wartell [34] the increased stability of Group II bulges
originates from positional degeneracy of the base bulge.
Additional conformational freedom, entailing higher entropy,
results in lowered duplex free energy (thus in increased
stability). According to Zhu et al. [24] position degeneracy
accounts for an average stabilization of -0.3 to -0.4 kcal/mol
(in agreement with the theoretical estimate [24] of
$-RT \ln 2 \approx -0.43$ kcal/mol at 37°C) for a two-position
degeneracy. Znosko et al. [35] reported Group II duplexes to be
on average $\Delta\Delta G^{\circ}$ = -0.8 kcal/mol more stable
than Group I duplexes. The latter value matches better our
observation of Group II hybridization close to the perfect
match hybridization signal.
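The theoretical estimate of -RT ln 2 for a two-position degeneracy can be checked numerically. The sketch below also converts a free energy difference into the corresponding ratio of equilibrium binding constants via the standard relation K = exp(-ΔG/RT); the function names are assumptions, and the gas constant is the CODATA value expressed in kcal/(mol K).

```python
from math import exp, log

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def degeneracy_stabilization(n_positions, temp_celsius=37.0):
    """Entropic free energy gain -RT ln(n) for an n-fold positional degeneracy."""
    temperature = temp_celsius + 273.15
    return -R_KCAL * temperature * log(n_positions)


def affinity_ratio(delta_delta_g, temp_celsius=37.0):
    """Ratio of binding constants for a free energy difference (kcal/mol)."""
    temperature = temp_celsius + 273.15
    return exp(-delta_delta_g / (R_KCAL * temperature))


two_fold = degeneracy_stabilization(2)   # about -0.43 kcal/mol at 37 degrees C
ratio_degeneracy = affinity_ratio(two_fold)
ratio_znosko = affinity_ratio(-0.8)      # for the -0.8 kcal/mol quoted above
```

By construction a two-fold degeneracy doubles the binding constant, whereas a stabilization of -0.8 kcal/mol corresponds to a noticeably larger factor, in line with the remark that the latter value matches the observed Group II signals better.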

To explain the large binding affinity of Group II duplexes we
propose the following mechanism (illustrated in Fig. 11) based
on the molecular zipper model




[Figure 11 graphic: zipping of the probe strand along the
target strand GCATCTGGACAAGTCAGGTC for a Group I base bulge (A)
and a Group II base bulge (B). Panel labels: frameshift -
zipping blocked; partial unzipping; A in bulged conformation -
rapid zipping of the duplex (Group I); any of the degenerate
T's adopts bulged conformation - rapid zipping of the duplex
(Group II).]

Figure 11

Proposed mechanism for the increased binding affinity of
duplexes with Group II base bulges. The Group I base bulge (A),
originating from the insertion of the unpaired base 'A',
creates a 1-nt frameshift between the complementary probe and
target sections, and thus acts like a barrier delaying the
formation of a stable duplex. The bulged 'A' needs to adopt a
favorable (e.g. looped out) conformation, so that the
frameshift is compensated and the zipping of complementary base
pairs can continue. Unlike the Group I base bulge in (A) the
Group II base bulge in (B), originating from the insertion of
the surplus base 'T' next to another 'T', is degenerate. Since
there is an increased probability that any of the two
degenerate bases adopts a favorable conformation, while
simultaneously the subsequent base is forming a base pair with
the corresponding base in the opposite strand (so that the
frameshift is overcome), the formation of a stable duplex is
accelerated.

[36,37]: Even in thermal equilibrium, due to thermal
excitation, zipping (consecutive base pairing) as well as
unzipping occur. The extent of the end-domain denaturation of
the duplex, which is described by a random walk (biased by the
duplex sequence), may finally result in complete dissociation
of the duplex. The binding affinity is determined by the ratio
$k_{\mathrm{nuc}}/k_{\mathrm{diss}}$ between the nucleation
rate $k_{\mathrm{nuc}}$ and the duplex dissociation rate
$k_{\mathrm{diss}}$. We consider that a defect does not have an
important influence on the unzipping, since the defect does not
present a barrier for the process. For closing of the strands,
however, the situation is different. The surplus (bulged) base
must act as a kinetic barrier, interrupting the rapid zipping
of the duplex. The 1-nt frameshift between the (largely)
complementary strands, owing to the unpaired bulge base,
prevents closing beyond the defect and results in a partially
zipped, and correspondingly weakly-bound, duplex. Duplex
closure can only progress if the interfering surplus base is
giving way (adopting a favorable looped out or stacked
conformation), thus allowing the subsequent base to form a
Watson-Crick base pair with the complementary base in the
target strand. From this point zipping can progress rapidly.
Therefore, compared to Watson-Crick nearest-neighbor pairs, a
base bulge (similar to a mismatch) has a decreased ratio of
zipping/unzipping rates $k_{+}/k_{-}$ and thus favors unzipping
of the duplex (i.e. the duplex dissociation rate
$k_{\mathrm{diss}}$ is increased with respect to the perfectly
matching duplex). For Group II bulges the $k_{+}/k_{-}$ ratio
at the defect site is increased with respect to Group I bulges:
in case of a Group II bulge there is an increased probability
that any of the degenerate bases makes way (and adopts, for
example, a favorable flipped-out conformation) while
simultaneously the subsequent base forms a base pair. This is
due to the increased number of possible molecular
conformations, which can lead to continuation of the zipping.
Then, as the frameshift is compensated, the rapid zipping to
complete the duplex occurs. Since the nucleation rate
$k_{\mathrm{nuc}}$ of Group I and Group II duplexes may be
assumed to be the same, the binding affinity of Group II
duplexes must be increased.

a) Solution hybridization DNA/RNA (Sugimoto et al. 2000) 

T G»G U«G G>G A«A GwC A>A A!«T Uj«C U>A C«T C 

b) Microarray hybridization DNA/RNA (Pozhitkov et al. 2006) 

T G«T U*T C>G U»A C«C C«C U»A A«A G>C A>G G»G A 

c) Gene silencing RNA/RNA (Schwarz et al. 2006)

Silencing efficiency depends on the single base mismatch between
the mRNA and siRNA sequences

C A>U G«C U>U U>A C«G U«C C>U C>G A>G G>A A>A G 

d) Microarray hybridization DNA/DNA (Wick et al. 2006) 

G T«G A>T T>G G>T G>A G«C T«A.A>C A>A C>C C«T C 

e) Microarray hybridization DNA/DNA (this study) 

G A>T T>G T>A G>C T«C A>A A«T G«C C«T C>A C>G G 

Figure 12

Stability orders of MM-types X·Y for hybridization in solution (a) and on microarrays (b, d, e). In the microarray experiments (b, d and e) MM binding affinities have been normalized with the corresponding PM binding affinity, whereas the orders a) and c) reflect the absolute impact of the MM pairs on duplex binding affinity. For the microarray MM-pairs (in b, d and e) the probe base X (DNA) is on the left and the target base Y (DNA or RNA) is on the right. The efficiency of RNA interference (c) (from [2]) is assumed to be determined by the stability of A-form RNA duplexes between the RISC-bound guide strand and the complementary mRNA. The left base X is part of the guide strand (at position 10) and the right base Y is part of the mRNA. Apart from the base pair X·Y the mRNA and siRNA sequences remained fixed. In (a) to (c) purine bases are highlighted in blue. In (d) and (e) mismatches with respect to a perfect matching C·G base pair are highlighted in red. Details on the individual stability orders are provided in the text.
Previous studies - including RNA/DNA hybridization

Tautz and coworkers [7] performed a mismatch study with 20 mer oligonucleotide microarrays fabricated by light-directed in situ synthesis with the Geniom® One instrument (febit biomed GmbH, Heidelberg). As in our study, they compared normalized hybridization signal intensities.

However, an important difference between the experiments described in [7] and our experiments is the use of in vitro transcribed RNA targets [7] originating from ribosomal RNA. They observe a more pronounced destabilization by purine-purine MMs compared to our results.

A further study on the impact of MM stabilities in RNA/DNA duplexes, in solution rather than on a microarray surface, has been published by Sugimoto et al. [32]. As discussed in [7], the destabilizing effect of purine-purine MMs is not observed by Sugimoto et al. [32]. However, the stability order in [32], referring to $\Delta G^{\circ}$ values of mismatched trinucleotide duplexes, considers absolute stability parameters, whereas [7,9] and our study consider mismatch discrimination with the corresponding PM binding affinity as a reference level. Therefore, the comparability with the RNA/DNA stability order in [32] is limited. A recent work on the impact of single base MMs in RNA-interference (RNAi) - allele-specific gene silencing experiments [2] - is interesting in the context of this study, since here the sequence recognition is based on base-pairing between the guide strand (a single RNA strand which is bound to the RISC complex) and a complementary mRNA. Schwarz et al. (see Schwarz: table 5b) have shown that among all MM-types incorporated at position 10 of the guide strand (apart from the point mutations the sequence of the guide strand was preserved) purine-purine MMs resulted in the least silencing of gene activity (owing to a small binding affinity of the mismatched sequences), whereas U·G, C·U and U·U mismatches resulted in a very efficient gene silencing (see Fig. 12c). It is assumed that purine-purine MMs strongly interfere with the formation of an A-form helix between the guide strand and the target mRNA [38]. This appears to be in accordance with the findings of Pozhitkov et al. on RNA/DNA MM discrimination. However, the inferred RNA/RNA mismatch stability order (shown in Fig. 12c) is not normalized with the corresponding PM stabilities, but rather reflects the absolute impact of the MM base pairs in a given duplex sequence and cannot be easily compared to our study and to [7].

Conclusion 

We performed a comprehensive, array-based study on the influence of point defects on the binding affinity of oligonucleotide duplexes. Contrary to previous studies by others, we have employed well-defined hybridization conditions by using short, end-labeled oligonucleotide target sequences (one at a time to minimize competitive effects) and can therefore exclude that target secondary structure, steric hindrance, labeling or competitive effects are relevant for an explanation of the observed results.

In our microarray-based hybridization assays the binding affinity of mispaired duplexes is dominated by the influence of defect position. The influence of the defect-type is about half in magnitude, when compared to defect-position.

There is also an influence of the neighboring sequence, which has farther reach than the defect next neighbor. Although this long reach interaction must somehow be related to the base stacking energies, we did not find a simple description. We attribute so far unexplained long range effects, in particular a trough-shaped position dependence, to molecular dynamics. We propose a molecular zipping mechanism as a suitable explanation. Zipping agrees well with the observation that Group II bulges (bulges next to identical bases) have stronger hybridization signals than expected from previous data. Experimentally, it is not completely clear whether the strong positional influence on oligonucleotide binding affinity is restricted to surface-hybridization or if it is also relevant for solution-phase hybridization (maybe to a smaller extent). The comparison to other related work [2,7,32], however, shows significant differences in the MM-type dependence of duplex binding affinities. Our comparative analysis of the impact of point defects on the binding affinity of DNA/DNA and RNA/DNA duplexes reveals that purine-purine MMs are more destabilizing in the latter. This may explain some discrepancies in the literature.

The use of DNA microarrays enables a detailed investigation of oligonucleotide duplex binding affinities, producing a wealth of data in simple experiments. We demonstrate that important aspects (defect position influence, differences between DNA/DNA, RNA/DNA and RNA/RNA hybridization, surface and bulk hybridization) about the impact of point defects on oligonucleotide duplex binding affinities are not yet understood. Our results from simple, controlled experiments agree well with results from extracting data from complex DNA target mixtures [7,9]. This shows that DNA hybridization on surfaces can be reproducible and quantitatively significant. Deviations from the behavior, which we describe here, are observed in microarray experiments and they must be due to complexity of DNA target mixtures.
Methods
Reagents

All reagents were used as purchased without further purification. Unless specified otherwise aqueous solutions were prepared with nuclease-free Milli-Q water (18.2 MΩ cm).

Reagents used in dendrimer-functionalized substrate preparation
20 mm round cover glasses (Menzel-Gläser, Braunschweig, Germany); Deconex 11 UNIVERSAL (Borer Chemie AG, Zuchwil, Switzerland); (3-aminopropyl)-triethoxysilane (APTES) (Sigma-Aldrich); ethanol analytical grade (VWR, Germany); 1,2-dichloroethane (Cat. No. 6837.1, Carl Roth GmbH, Germany); phosphorus dendrimers with aldehyde moieties cyclotriphosphazene-PMMH-96 (Cat. No. 552097, Aldrich); potassium hydroxide (Carl Roth GmbH); sodium borohydride (99.99 %, Sigma-Aldrich).

Reagents and solutions used in light-directed DNA-chip synthesis
RayDite™ photolabile 3'-nitrophenylpropyloxycarbonyl (NPPOC) phosphoramidites (NPPOC-dA(tac), NPPOC-dC(ib), NPPOC-dG(ipac), NPPOC-dT) were purchased from Sigma-Proligo (Hamburg, Germany). Acetonitrile (ROTISOLV for DNA synthesis, water < 10 ppm, Carl Roth GmbH, Germany); Activator 42 0.25 M (Sigma-Proligo); iodine based oxidizer (part no. 401732, Applied Biosystems). Photo-deprotection is carried out in a mildly basic solution of 25 mM piperidine (99%, Aldrich) in anhydrous acetonitrile. Final base deprotection is performed in a 1:1 mixture of ethylenediamine (analytical grade, Fluka) and ethanol (analytical grade, VWR, Germany). UV glue (Norland optical adhesive 60, Edmund Optics) is employed to glue the chip after synthesis onto a stainless steel support.

Hybridization buffer 

The hybridization buffer comprises 5 × SSPE pH 7.4, with either 0.1% SDS or 0.01% Tween 20; the initial target concentration in the hybridization solution was 1 nM in all experiments.

Target oligonucleotides

Cy3-labeled target oligonucleotides (DNA and RNA) - see Tab. 1 - were synthesized by MWG Biotech AG (Ebersberg, Germany) and by IBA Nucleic Acids Synthesis (Göttingen, Germany).

Preparation of the phosphorus dendrimer-functionalized substrates

Dendrimer-functionalized substrates were prepared according to Le Berre et al. [39]. For compatibility with the in situ synthesis process (coupling of phosphoramidite building blocks) the aldehyde moieties of the dendrimers are reduced to hydroxyl groups. Reduction is performed in an aqueous solution of 0.35% sodium borohydride (for 3 hours at room temperature, under gentle agitation). After rinsing with Milli-Q water the slides are ready for use. Long term storage for more than one year at 4°C (under air atmosphere) does not affect the substrates.

DNA microarray fabrication

Oligonucleotide microarrays tailor-made for our experiments were fabricated in-house employing light-directed in situ synthesis [14,15]. The design of the DMD based synthesis apparatus [16-21,40] is described in Naiser et al. [10]. Microarrays were synthesized in situ on hydroxy-functionalized phosphorus dendrimer supports. The initial photoreactive monolayer is created by coupling of NPPOC-dT-phosphoramidite. Subsequent light-directed synthesis was performed with NPPOC-phosphoramidite chemistry [41].

Probe sets for the experiments are derived from various 16-25 mer probe sequence motifs that are complementary to the set of fluorescently labeled target sequences (Tab. 1) available for this study. On the DNA chip each probe set (comprising between 64 and 400 features) is arranged as a closely spaced feature block (see Additional file 9) which during the analysis can easily be imaged as a whole. Compact arrangement reduces position-dependent systematic errors that can originate from gradients introduced during synthesis and/or hybridization (see below).

DNA chips produced for this study typically comprise about 2000 to 3000 features. A relatively large feature size of 21 µm (6 × 6 DMD pixels) is used to minimize image analysis related quantification errors.

Oligonucleotide target hybridization on the microarray - measurement of the hybridization signal intensity

Hybridization of fluorescently labeled targets to surface-bound probes is carried out in a temperature-controlled hybridization chamber. The chip, synthesized on a 20 mm diameter cover glass (glue-fixed onto a stainless steel support), constitutes a window into the chamber. The chamber volume of 150 µl is formed by a cutout in a 1.5 mm sheet of PDMS silicone rubber. Temperature is controlled with a foil heater attached to a stainless steel plate composing the backside of the hybridization chamber.

Relative intensities within the probe sets are largely independent of the hybridization time, typically chosen to be 10 minutes. Probe sequence motifs with small hybridization affinities are hybridized for up to 30 minutes to achieve a sufficiently large hybridization signal/background ratio. In the microarray hybridizations, the hybridization temperature for 16 mer probes was typically 30°C. An increased hybridization temperature of 40°C has been applied for probes complementary to the target URA. At 30°C these, due to their large hybridization affinity, hybridize with reduced defect discrimination. Probes with a length of 20 and more bases are hybridized at 40°C. Hybridization is monitored in real-time on an Olympus IX81 fluorescence microscope. During acquisition of the hybridization signal the microarray is left in the hybridization solution. A 10× 0.4 NA UPlanApo objective provides a sufficiently large field of view. An electron multiplying CCD camera (Hamamatsu EM-CCD 9102) with a 1000 × 1000 pixel resolution is used for image acquisition. During image acquisition shade correction is performed to compensate for intensity inhomogeneities in fluorescence excitation.

Image analysis software developed in-house is employed to read the intensities of hundreds of features simultaneously.

Hybridization signal analysis - normalization

Hybridization signal measurements are performed with the microarray immersed in the hybridization solution. Thus, the measured hybridization intensity signal $I_{raw}$ is composed of the feature intensity $I_{feat}$ and the solution background intensity $I_{back}$ (originating from fluorescent targets floating above the microarray in the hybridization solution). The overall intensity $I_{raw} = f(x)\,(I_{feat} + I_{back})$ is affected by the function $f(x)$ which accounts for spatial variations of the fluorescence excitation and the light collection efficiency of the microscope system (e.g. due to vignetting). Apart from $I_{raw}$ we also locally (i.e. next to the corresponding microarray feature - see Additional file 10A) measure the solution background intensity $f(x) \cdot I_{back}$. A solution-background correction is performed by subtraction of the background fluorescence intensity. Further, by division by the solution background intensity $f(x) \cdot I_{back}$ we cancel the feature-position related bias $f(x)$:

$$ I_{norm} = \frac{f(x)\,(I_{feat} + I_{back}) - f(x)\,I_{back}}{f(x)\,I_{back}} = \frac{I_{feat}}{I_{back}} \qquad (1) $$
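The background correction described in this subsection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `normalize_signal` and all numbers are hypothetical, and the in-house image analysis software is not described in enough detail to reproduce it.

```python
def normalize_signal(raw_feature, raw_background):
    """Subtract the local solution background, then divide by it.

    raw_feature    : intensity measured on the feature, f(x) * (I_feat + I_back)
    raw_background : intensity measured next to the feature, f(x) * I_back
    The position-dependent factor f(x) cancels in the ratio.
    """
    if raw_background <= 0:
        raise ValueError("local background intensity must be positive")
    return (raw_feature - raw_background) / raw_background


# Synthetic illustration (made-up numbers): the same feature imaged at three
# chip positions with different illumination factors f(x).
i_feat, i_back = 50.0, 10.0
for f_x in (0.8, 1.0, 1.2):
    raw_feature = f_x * (i_feat + i_back)
    raw_background = f_x * i_back
    print(f_x, normalize_signal(raw_feature, raw_background))
```

In this synthetic case the normalized value stays close to I_feat / I_back regardless of f(x), which is the purpose of dividing by the locally measured background.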



In the further analysis we separate between the relatively strong defect positional influence and the defect-type related influence on the binding affinity. The positional influence is calculated as the moving average of mismatch hybridization signals (including all mismatch types) over a window of five consecutive MM-positions. By subtraction of the mean profile we obtain the MM type dependent contributions $\delta I_{MM}$ to the hybridization signal.
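A minimal Python sketch of this separation follows. It assumes the mismatch signals are stored as one list per defect position (three MM signals each) and that the five-position window is simply truncated at the duplex ends; the text does not say how the ends were treated, and the function names are hypothetical.

```python
def positional_profile(mm_signals, window=5):
    """Moving average over all MM signals within `window` consecutive positions.

    mm_signals: list indexed by defect position; each entry is a list of the
    MM hybridization signals measured at that position.
    Assumption: the window is truncated at the first and last positions.
    """
    half = window // 2
    profile = []
    for pos in range(len(mm_signals)):
        lo = max(0, pos - half)
        hi = min(len(mm_signals), pos + half + 1)
        pooled = [s for group in mm_signals[lo:hi] for s in group]
        profile.append(sum(pooled) / len(pooled))
    return profile


def type_contributions(mm_signals, window=5):
    """Subtract the positional profile to obtain the MM-type dependent part."""
    profile = positional_profile(mm_signals, window)
    return [[s - mean for s in group]
            for group, mean in zip(mm_signals, profile)]
```

For example, `type_contributions([[p - 1.0, p, p + 1.0] for p in range(7)])` leaves the interior positions with the deviations -1, 0 and +1, because the linear positional trend is removed by the moving average.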

To compare $\delta I_{MM}$ from different defect profiles it is necessary to account for the fact that the mismatch discrimination depends on the binding affinity. Mismatch discrimination is stronger in weakly-binding short duplexes or duplexes with a large AT-content. Vice versa, in case of duplexes with larger binding affinities the differences between PM and MM duplexes and among different MMs, respectively, may be rather small. We performed normalization of $\delta I_{MM}$ by division by the standard deviation $\sigma_{\delta I_{MM}}$ (see Additional file 10B), or, alternatively, by division by the average of all MM hybridization signals of the corresponding MM defect profile.
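The first of the two normalization options, division by the standard deviation of the position-independent profile, might look as follows. Whether the population or the sample standard deviation was used is not stated; the sketch assumes the population form (`statistics.pstdev`), and the function name is hypothetical.

```python
import statistics


def normalize_by_std(delta):
    """Scale MM-type contributions of one defect profile to unit spread.

    delta: list indexed by defect position, each entry a list of MM-type
    dependent contributions (positional profile already subtracted).
    Assumption: population standard deviation over the whole profile.
    """
    flat = [d for group in delta for d in group]
    sigma = statistics.pstdev(flat)
    if sigma == 0:
        raise ValueError("profile has no MM-type dependent variation")
    return [[d / sigma for d in group] for group in delta]
```

After this step a weakly bound motif with large absolute differences and a strongly bound motif with small absolute differences end up on the same scale, so their MM-type contributions can be pooled for statistics.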

Design of the DNA chip experiments

The flexibility of the in situ synthesis and the excellent spot homogeneity simplify a comprehensive comparative analysis with the capability to detect subtle differences of the probe binding affinities. The experiments mainly differ in selection and spatial arrangement of the probe sequences. Particular experiments focus on the extraction of the positional dependence, the comparison of different defect types and on the identification of further influential parameters.

Spatial variations of the photodeprotection intensity and optical aberrations affecting the imaging contrast can result in gradients (as indicated in Additional file 11B) of the probe DNA quality (due to a varying number of synthesis errors). Thus, for a reliable determination of subtle differences in hybridization affinities, probes to be compared directly should be closely spaced on the microarray.

In the following we describe the design of the individual experiments.

Single base mismatch

To investigate the positional dependence of single base mismatches and the impact of the mismatch type, we designed microarrays containing comprehensive sets of MM probes derived from a series of 25 16 mer probe sequence motifs. Position and type of the mismatch base pair were systematically varied, allowing us later to distinguish between the dominating positional dependence and other influential factors.

The features are arranged in groups of four (see Additional file 11A), corresponding to the four possible substituent bases (A, C, G and T) at a particular base position. A group comprises three mismatch probes plus one perfect match probe (PM) used for control. Sixteen of these feature groups (one for each base position) are arranged in a square feature block comprising in total 64 features (Additional files 9 and 11A).
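The composition of such a feature block can be illustrated with a short Python sketch that enumerates the four substituent bases at every position of a 16 mer motif. The function name and the tuple layout are hypothetical; the motif is one of the probe sequences listed for Additional file 9, used here purely as a character string.

```python
BASES = "ACGT"


def mismatch_feature_groups(motif):
    """Return one group of four probes (A, C, G, T) per base position.

    Each probe is a (sequence, kind) tuple, where kind is "PM" when the
    substituent base equals the original base and "MM" otherwise.
    """
    groups = []
    for pos, original in enumerate(motif):
        group = []
        for base in BASES:
            probe = motif[:pos] + base + motif[pos + 1:]
            group.append((probe, "PM" if base == original else "MM"))
        groups.append(group)
    return groups


groups = mismatch_feature_groups("TATTACTCCACCTCAC")
print(len(groups), sum(len(g) for g in groups))  # groups and total features
```

For a 16 mer this yields sixteen groups of four, i.e. the 64 features of one block, with the perfect match probe repeated once per group as an internal control.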

Single base bulges

Single base insertions and deletions, due to an extra unpaired base, result in bulged duplexes with reduced stability. A comprehensive study on the impact of single base insertions was performed using the chip design shown in Additional file 11A. The experiment comprised about 1000 single base insertion probes (insertion base type and position systematically varied) derived from twelve 20 to 25 mer probe sequence motifs.
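A hedged Python sketch of how such an insertion probe set could be enumerated and sorted into the two bulge groups is given below. The classification rule is an assumption derived from the description of Group II bulges as bulges next to identical bases: an insertion is labeled Group II when the inserted base equals at least one of its two neighbors. Restricting insertions to interior positions and the function name are likewise assumptions.

```python
def insertion_probes(motif):
    """Enumerate single base insertion probes of a probe sequence motif.

    Returns (position, inserted_base, probe_sequence, group) tuples, where
    the base is inserted between motif[position - 1] and motif[position].
    Assumed rule: group "II" if the inserted base matches a neighboring
    base, otherwise group "I".
    """
    probes = []
    for pos in range(1, len(motif)):
        for base in "ACGT":
            probe = motif[:pos] + base + motif[pos:]
            group = "II" if base in (motif[pos - 1], motif[pos]) else "I"
            probes.append((pos, base, probe, group))
    return probes
```

Note that Group II insertions are positionally degenerate: inserting a T on either side of an existing T produces the same probe sequence, so the same sequence can appear under two neighboring positions.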

Direct comparison of single base MMs and single base bulges
An experiment allowing for a direct comparison of PM, MM, single base insertion and deletion probes has been performed. Probe sets were derived from 16 mer probe sequence motifs, complementary to the targets listed in Tab. 1. For each of the 16 possible defect positions a set of 9 probes (comprising four single base insertions, one base deletion, three MMs and one PM probe) has been created. To avoid that a regular arrangement of the probe features could possibly affect the measurement (e.g. by introducing a bias due to increased target depletion near a PM probe), the sets of nine probes were randomly arranged in 3 × 3 matrices (Additional file 11B).
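The nine-probe set and its randomized 3 × 3 placement can be sketched as follows. The convention of inserting the extra base directly before the defect position, the function names and the use of a seeded `random.Random` are assumptions made for illustration only.

```python
import random


def defect_probe_set(motif, pos):
    """Build the 9 probes for one defect position of a probe motif:
    1 PM, 3 single base MMs, 4 single base insertions and 1 deletion.
    Assumption: insertions are placed directly before position `pos`.
    """
    original = motif[pos]
    probes = [("PM", motif)]
    probes += [("MM", motif[:pos] + b + motif[pos + 1:])
               for b in "ACGT" if b != original]
    probes += [("insertion", motif[:pos] + b + motif[pos:]) for b in "ACGT"]
    probes.append(("deletion", motif[:pos] + motif[pos + 1:]))
    return probes


def random_3x3(probes, rng):
    """Shuffle the nine probes and lay them out as a 3 x 3 matrix."""
    shuffled = list(probes)
    rng.shuffle(shuffled)
    return [shuffled[row * 3:(row + 1) * 3] for row in range(3)]


layout = random_3x3(defect_probe_set("TATTACTCCACCTCAC", 5), random.Random(0))
```

Shuffling each set independently means that the PM probe does not occupy the same place in every matrix, which is what removes the systematic depletion bias mentioned above.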

Direct comparison between DNA/DNA and DNA/RNA mismatches
The chip design (Additional file 11B) and the experimental procedures were basically identical with that of the previous experiment. Hybridization assays were conducted with fluorescently labeled DNA targets and corresponding RNA targets (Tab. 1). To avoid fabrication-related variation of the hybridization signals the hybridization assays were performed on the same chip, initially with RNA and subsequently, after regeneration of the microarray (by heating to 70°C in pure hybridization buffer), with the corresponding DNA targets.

Three microarrays were fabricated, each one focussing on one particular target sequence (COM, PET and LBE). Each microarray assay investigated single base MM and bulge defects for 6 different probe sequence motifs (obtained by shifting the 16 to 20 mer probe motif with respect to the longer target sequence). Two replicates of each feature block are employed to control for the reproducibility of the measurement.

Hybridization assays with the three microarrays were performed independently and on different days. The subsets of data obtained from each of the assays display the same trend for the defect-type dependent binding affinities. Yet smaller subsets from the individual defect profiles (originating from a single probe sequence motif) show basically the same trend of binding affinities which is, however, superposed by a strong sequence dependent bias.

Authors' contributions 

TN developed the experimental setup, performed the experiments, carried out the data analysis and drafted the manuscript. OE aided in DNA chip synthesis and data analysis. TM participated in the development of the DNA microarray synthesizer. WM and JK participated in data interpretation and helped to draft the manuscript. AO conceived of the study, and participated in its design and coordination and aided in drafting the manuscript.
Additional file 1

The raw hybridization signal intensities of the 16 mer probes in a microarray hybridization experiment on single base mismatch discrimination. The data was extracted from fluorescence micrographs (16-bit gray scale TIFF images) of the hybridized microarrays. The dataset comprises the hybridization signal raw data and probe/target sequences of 24 mismatch defect profiles.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S1.txt]

Additional file 2

Correlation between the MM-type related hybridization signal deviations from the mean profile $\delta I_{MM}$ and the predicted Gibbs free energy increments $\delta\Delta G^{\circ}_{37}$ between MM and corresponding PM duplexes. $\delta\Delta G^{\circ}_{37}$ was calculated from mismatch NN-parameters. Hybridization signal data processing as described in Additional file 10. The MM type is categorized according to the MM base pair X·Y (in A) and according to the flanking base pairs (in B). Data points indicate the median $\delta I_{MM}$ of the individual mismatch/flanking base pair categories. The small number of data within the individual categories (owing to the combinatorial increase of mismatch/nearest neighbor categories) can result in outliers. The exact MM-category corresponding to each data point can be identified by looking up the symbols in the identical plots in (A) and (B). Part (A) shows a weak, approximately linear correlation between $\delta I_{MM}$ and $\delta\Delta G^{\circ}_{37}$, indicating that the MM discrimination on microarrays can be related to MM nearest-neighbor parameters established from solution-phase hybridization. A relatively small mismatch discrimination can be observed for a variety of MM-types with $\delta\Delta G^{\circ}_{37}$ < 1.5 kcal/mol. Part (B) indicates that mismatch-types with C·G-flanking base pairs at both sides have (on average) larger hybridization signals than mismatches with A·T flanking base pairs at both sides. Among the more stable MM types above the trend line (which serves the purpose to split each of the individual MM base pair type related clusters - shown in part A - in two halves) 20 have C·G-only flanking pairs and 19 have A·T-only flanking pairs. In contrast, among the less stable MM-types - below the trend line - only 13 have C·G-only flanking pairs, whereas 29 have A·T-only flanking pairs.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S2.eps]

Additional file 3

The raw hybridization signal intensities of the 22-25 mer probes in a microarray hybridization experiment on the binding affinity of bulged duplexes. Hybridization signal intensities were extracted from fluorescence micrographs (16-bit gray scale TIFF images) of the hybridized microarrays. The dataset comprises the hybridization signal data and probe/target sequences of 14 defect profiles.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S3.txt]
Additional file 4

The raw hybridization signal intensities of the 16 mer probes in a microarray hybridization experiment designed for a direct comparison between binding affinities of single base mismatches and single base bulges. The data was extracted from fluorescence micrographs (16-bit gray scale TIFF images) of the hybridized microarrays. The dataset comprises the hybridization signal raw data and probe/target sequences of 11 defect profiles.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S4.txt]

Additional file 5

Histograms of hybridization signal differences (X/Y; X and Y denote the different insertion bases in otherwise identical probe sequences) reveal correlations between the hybridization signals of different insertion types. To exclude the impact of systematically increased intensities of Group II insertions only Group I insertions are regarded here. Between T- and C-insertions (and between C and ^ insertions) a correlation, as indicated by a narrow distribution with a pronounced peak near zero, is observed. The broad distribution of hybridization signal differences between G and A insertions does not show a distinct peak, indicating that there is no correlation but rather an anti-correlation for insertions of A and G.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S5.eps]

Additional file 6

Comparison of DNA/DNA and RNA/DNA mismatch hybridization signals - mismatch defect profiles. Parts A-D make a direct comparison of hybridization signals, obtained from subsequent hybridization of RNA targets (top) and DNA targets (bottom) on the same microarray. The defect positional influence is identical for DNA/DNA and RNA/DNA hybridization. However, the impact of MM-types reveals systematic differences. The sequences shown in the plots are the probe sequence motifs that have been modified by base substitution. The hybridization signal (in a.u.) is plotted against the defect position. Hybridization signal processing: solution background correction (see Methods section). Substitution bases A (red cross), C (green circle), G (blue star) and T (cyan triangle) result in 3 MM duplexes and one PM duplex at every defect position; hybridization signals of duplexes with single base deletions (yellow line), moving average MM hybridization signal (black line).
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S6.eps]

Additional file 7

Hybridization signal variation between pairs of mismatched RNA/DNA and analog DNA/DNA duplexes (hybridization signals of DNA/DNA duplexes were subtracted from the hybridization signals of the corresponding RNA/DNA duplexes). The largest differences between RNA/DNA and DNA/DNA binding affinities were found for the MM-types T·C, C·A and A·C.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S7.eps]



Additional file 8

The raw hybridization signal intensities of this microarray hybridization experiment provide a direct comparison between RNA/DNA and DNA/DNA hybridization. Hybridization signal intensity raw data was extracted from fluorescence micrographs (16-bit gray scale TIFF images) of the hybridized microarrays. The dataset contains the data of 3 independent experiments (performed with 3 different microarrays). Each microarray dataset comprises the hybridization signal data and probe/target sequences of 74 defect profiles.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S8.txt]

Additional file 9

Fluorescence micrograph of hybridized features (feature size 21 µm) in the 16 mer mismatch experiment. The shading-corrected image shows two feature blocks corresponding to two different 16 mer probe sequence motifs (3'-TTCACCCATATTACTC-5' - to the left, 3'-TATTACTCCACCTCAC-5' - to the right) both hybridizing with the fluorescently labeled target sequence COM (5'-Cy3-AAGTGGGTATAATGAGGTGGAGTG-3'). Each feature block comprises all single base mismatches of the particular probe sequence. Groups of four features (as indicated by the marked groups 1 and 2) correspond to each one of the 16 possible mismatch base positions. As indicated by the letters between the feature blocks the uppermost row of features in each group corresponds to an A base at the corresponding base position, followed by probes with C, G and T (see also Additional file 11A). The brightest feature within each group corresponds to the perfect matching probe. Nonhybridized targets in the hybridization solution contribute to the background intensity between the features. Mismatch intensity profiles for the probe sequence motif 3'-TATTACTCCACCTCAC-5' are shown in Fig. 2.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S9.eps]



Additional file 10

Data analysis procedures. (A) To reduce intensity gradients on the microarray (bias described by the spatially varying function $f(x)$) originating from the fluorescence microscope optics (e.g. due to inhomogeneous fluorescence excitation or vignetting) we apply a bias correction procedure on the raw intensity data: The $f(x)$ component in the raw hybridization signal intensity is canceled by normalization with the local solution background fluorescence intensity $f(x) \cdot I_{back}$. (B) Normalization of the MM type dependent component $\delta I_{MM}$ of the hybridization signal is necessary since the magnitude of mismatch discrimination depends on the binding affinity of the sequence motif. The defect profile in a) shows a large MM discrimination (typical for a weakly bound duplex), whereas the defect profile in b) shows a small MM discrimination (typical for more strongly bound duplexes). In the position-independent defect profiles (right) the positional influence (obtained as the moving average of all MM types over five consecutive defect positions - shown as a bold line in the defect profile in the left image) has been subtracted, to yield the MM-type dependent influence $\delta I_{MM}$. For statistical analysis of the defect-type contribution, including comparable data from different defect profiles, normalization is performed by division through the standard deviation $\sigma_{\delta I_{MM}}$ of the position-independent defect profile.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S10.eps]
Additional file 11

Microarray feature arrangements (A) for the single base mismatch/single base insertion experiments (compare with Additional file 9). For the direct comparison between single base MMs and single base bulges and for the comparison of DNA/DNA and RNA/DNA hybridization the feature arrangement (B) was used. This more compact arrangement of features has been chosen to minimize the impact of gradient effects on the relative hybridization signal values of the various defect types. The 9 features belonging to each defect position (depicted with dashed boxes for positions 1 and 10) comprise 3 single base MMs, 4 single base insertions, one single base deletion and one perfect matching probe. The gradient indicated in (B) demonstrates that the erroneous variation within the closely spaced feature set belonging to a particular defect position is significantly smaller than for features located far apart.
[http://www.biomedcentral.com/content/supplementary/1472-6750-8-48-S11.eps]
Abstract

Background: The propensity of oligonucleotide strands to form stable duplexes with complementary sequences is fundamental to a variety of biological and biotechnological processes as various as microRNA signalling, microarray hybridization and PCR. Yet our understanding of oligonucleotide hybridization, in particular in presence of surfaces, is rather limited. Here we use oligonucleotide microarrays made in-house by optically controlled DNA synthesis to produce probe sets comprising all possible single base mismatches and base bulges for each of 20 sequence motifs under study.

Results: We observe that mismatch discrimination is mostly determined by the defect position (relative to the duplex ends) as well as by the sequence context. We investigate the thermodynamics of the oligonucleotide duplexes on the basis of a double-ended molecular zipper. Theoretical predictions of defect positional influence as well as long range sequence influence agree well with the experimental results.

Conclusion: Molecular zipping at thermodynamic equilibrium explains the binding affinity of 
mismatched DNA duplexes on microarrays well. The position dependent nearest neighbor model 
(PDNN) can be inferred from it. Quantitative understanding of microarray experiments from first 
principles is in reach. 



Background 

The well-known double-helix structure of nucleic acids
results from sequence-specific binding between complementary
single strands. Sequential base pairing between
A·T and G·C base pairs along the two complementary
strands results in the formation of stable duplexes. This
so-called hybridization process is fundamental to many
biological processes and biotechnologies. Microarrays
consist of surface-tethered probe sequences, which act as
specific scavengers for their respective complementary
target sequence. The molecular recognition enables a
highly parallel detection of nucleic acid sequences in
complex target mixtures. Hybridization also occurs with
single mismatched (MM) base pairs; however, these
duplexes are significantly less stable than the corresponding
perfect match (PM) [1, 2]. The single base pair
mismatch-discrimination capability of short (~20 nt)
oligonucleotide probes provides an important diagnostic






BMC Bioinformatics 2008, 9:509 



http://www.biomedcentral.com/1471-2105/9/509



tool for the detection of point-mutations and single
nucleotide polymorphisms (SNPs) [3]. DNA duplex
stability arises from hydrogen bonding and base stacking
interactions (the latter comprise van der Waals interactions,
electrostatic and hydrophobic interactions
between adjacent base pairs). According to the
well-established nearest-neighbor model, thermodynamically
a nucleic acid duplex can be considered the sum of these
nearest-neighbor (NN) interactions [4-6]. The binding
free energy of an oligonucleotide duplex can be
predicted from the nearest-neighbor free energy parameters:
The helix propagation parameters (one for each
of the 10 possible base-pair doublets in case of a DNA/DNA
duplex) account for the duplex sequence. Further
parameters provide corrections for duplex initiation, A·T
terminal pairs or a symmetry penalty in case of
self-complementary sequences. The NN model adequately
predicts oligonucleotide duplex melting temperatures
in bulk solution [7]. Datasets of Watson-Crick NN
parameters [8] provide the basis for nucleic acid
structure and melting temperature prediction software
like the DINAMelt web server [9] (UNAFold), the
HYTHER server and others. The NN model can be
extended beyond the Watson-Crick pairs to include
single base MM defects [7, 10].
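The nearest-neighbor sum described above can be sketched in a few lines of Python. This is a minimal illustration only: the helper names (`canonical_doublet`, `nn_free_energy`) are hypothetical, the numerical parameter values are placeholders rather than published NN parameters, and the terminal A·T and symmetry corrections are left out.

```python
# Minimal sketch of a nearest-neighbor (NN) free energy sum for a DNA/DNA duplex.
# All names and numerical values are illustrative assumptions, not published data.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def canonical_doublet(doublet):
    """Map a 5'->3' doublet and its reverse complement onto one common key."""
    reverse_complement = "".join(COMPLEMENT[base] for base in reversed(doublet))
    return min(doublet, reverse_complement)


# The 16 possible doublets collapse to 10 unique base-pair doublets.
UNIQUE_DOUBLETS = sorted({canonical_doublet(a + b) for a in "ACGT" for b in "ACGT"})

# Placeholder propagation parameters (kcal/mol): one value per unique doublet.
PLACEHOLDER_PARAMS = {doublet: -1.0 for doublet in UNIQUE_DOUBLETS}


def nn_free_energy(sequence, nn_params, dg_initiation=0.0):
    """Sum the NN propagation terms along one strand, plus an initiation term."""
    total = dg_initiation
    for i in range(len(sequence) - 1):
        total += nn_params[canonical_doublet(sequence[i:i + 2])]
    return total
```

With a real parameter table in place of `PLACEHOLDER_PARAMS`, the same loop yields the two-state duplex free energy that the text refers to; the result does not depend on where along the duplex a given doublet sits.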

In spite of good knowledge about nucleic acid hybridization
in solution, the prediction of binding affinities on
DNA microarrays remains empirical. Recent microarray
studies [11-15] report that the influence of even a point
defect on hybridization signal intensity cannot be
predicted easily. In particular the influence of defect
position on the hybridization signal is stronger than the
influence of MM-type [12, 14, 16].

Experiments show that the two-state nearest-neighbor
(TSNN) approach [7], which has been very successful in
predicting duplex stability in solution, does not appropriately
describe MM binding affinities on DNA microarrays.
The NN model does not account for the position
of the individual NN pairs [7], except for the outermost
ones. Based on microarray data, Zhang et al. [17]
proposed a position dependent nearest-neighbor
(PDNN) model. The model assumes that the duplex
binding free energy can be expressed as a weighted sum
of stacking energies with empirically derived positional
weight parameters [17-21]. The purpose of this study is
to investigate the influence of point defects on (surface
bound) hybridization experimentally and theoretically.
Previous studies investigate mismatch discrimination
with samples of very different sequence motifs [11, 12].
However, other effects such as secondary structure
formation or competitive binding may reduce the
visibility of the impact of the MM-defect on the binding
affinity. To avoid such complications we performed
experiments with fixed sequence motifs. We focus on
small variations of the probe sequences. We perform
hybridization studies with home made microarrays
comprising sets of very similar probe sequences. We
use a single target sequence in each hybridization assay
in order to avoid inter-target binding as well as target
competition of different sequences for one and the same
probe sequence. In order to avoid excluded volume
interactions or secondary structure we limit the length of
the target sequence to be of the order of the probes.
These simplifications (described in detail in [14]) enable
a detailed investigation of the influences of defect type,
defect position, flanking base pairs and the sequence
motif on the binding affinity. The extensive set of
hybridization affinities obtained from our experiments
enables us to perform a very complete analysis. We
compare the experimental data to theoretical modeling
based on a double-ended molecular zipper approach
(the double-ended nucleic acid zipper has been previously
described by [22-26]). We find that in order to
reproduce the microarray hybridization signal in our
model, the heterogeneity of binding affinities - mostly
owing to in situ synthesis-related probe defects (e.g.
probe polydispersity) - needs to be taken into account.
More than that, synthesis defects turn out to be useful for
parallel detection of many different sequences.

Methods 

DNA Microarray Hybridization Experiments

Hybridization assays are performed on high-density
oligonucleotide microarrays (see Fig. 1). These microarrays
(DNA Chips) are fabricated in-house [14] on the
basis of light-directed solid-phase combinatorial chemistry
[27, 28]. A "maskless" photolithographic technique
[13, 29-31] based on a digital micromirror device type
spatial light modulator (DMD™, Texas Instruments Inc.)
enables tailor-made design of DNA microarrays (with up
to 25000 different probe sequences) on a laboratory
scale. Point defects - single base substitutions, insertions
and deletions - are produced in the in situ synthesis
process by variation of the nucleotide coupling scheme
for the particular probe sequence.

Protocols for the preparation of dendrimer functionalized
microarray substrates (adapted from [32]) and for
the light-directed synthesis (based on NPPOC-phosphoramidites
[33]), as well as details on the hybridization
assay and on fluorescence microscopy based
microarray analysis (Fig. 1) are provided in Naiser et al.
[14, 15].

In each microarray hybridization assay a probe set of
cognate probes with purposefully introduced point
mutations - derived from a common probe sequence
Figure 1

Fluorescence micrograph (taken with an Olympus IX81
epi-fluorescence microscope and a Hamamatsu EM-CCD
camera) of a microarray feature-block comprising
variations of the 16 mer probe sequence motif
3'-TATTACTGGACCTGAC-5'. Microarray hybridization
was performed with the 5'-Cy3-labeled RNA oligonucleotide
target 3'-AACUCGCUAUAAUGACCUGGACUG-5' (target
concentration: 1 nM in 5× SSPE, pH 7.4, 0.01% Tween-20,
T = 30°C). Each 3 × 3 sub-array comprises (randomly
arranged) one perfect matching probe, three single base
mismatch probes, four insertion probes and one single base
deletion probe. In Fig. 2A the hybridization signals
(fluorescence intensities, averaged over the center of the
microarray features) are plotted versus the defect position.
The size of each microarray feature is 21 µm and the pitch of
the array is 35 µm. The significantly brighter feature-block at
left comprises variations of the 20 mer probe sequence motif
3'-TTGAGCGATATTACTGGACC-5'.



motif - is hybridized against a single target sequence,
which perfectly matches the probe sequence motif. We
systematically vary defect type and defect position to
provide the complete "defect profile" of hybridization
affinities with probe sets. We include not only all single
base mismatches (MMs), but also, in order to investigate
mismatch discrimination in a broader context of other
sequence defects, we consider single base bulges
(originating from insertions and deletions) as well as probes
with multiple defects. Since the ca. 130 probes within
each probe set differ only by single bases we are able to
distinguish between defect-positional and sequence
influence. In our experimental conditions hybridization
equilibrium is reached after a few tens of minutes.
Further details can be found in [14].

Results 

Position Dependent Influence of Single Base MMs and
Bulges on Probe-Target Binding Affinity

From the fluorescence micrograph (Fig. 1), we obtain the
hybridization signals, which we plot as a function of
defect position (Fig. 2). We note a strong influence of the
defect position on probe-target binding affinity which is
larger than the influence of the defect type. We find that
bulge defects display a very similar position-dependent
influence on hybridization signal intensity to mismatches.
Furthermore we observe that the magnitude
of mismatch discrimination (and bulge discrimination)
at a particular defect position (i.e. the shape of the defect
profile) depends on the duplex sequence.

As can directly be inferred from Fig. 2, defects in the
middle of the probes are most destabilizing. In the center
of a 16 mer duplex a single nucleotide MM typically
reduces the hybridization signal to 0-40% of the
corresponding PM duplex hybridization signal. Defect
type and nearest-neighbor effects have less influence on
the hybridization signal than defect position. Our
experiments show a mostly monotonic decrease of
hybridization signals over a range of typically 5-8 defect
positions (for 16 mer probes, and up to 14 positions for
some 25 mer sequence motifs) from the duplex ends
towards the center of the duplex. This is consistent with
previous work [11, 12].

In order to separate the defect positional influence (DPI)
for a particular probe sequence motif from the defect
type related influences we run a moving average filter on
the defect profile. We observe that the DPI is not only a
simple function of the distance between the defect and
the duplex-ends, but it is also related to the nucleotide
sequence (compare Fig. 2A and 3B and Fig. 3A and 3B).
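A moving average over the defect profile can be sketched as follows. The window length and the shrinking of the window at the duplex ends are assumptions made for illustration; the text does not specify the filter settings that were actually used.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at both ends of the profile.

    `values` holds one hybridization signal per defect position, for example
    the mean over all mismatch types at that position. `window` is an assumed,
    odd filter length.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        smoothed.append(sum(values[start:stop]) / (stop - start))
    return smoothed
```

Subtracting the smoothed curve from the raw signals leaves the residual that the text attributes to defect type and nearest-neighbor effects.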

We also perform hybridization experiments on oligonucleotide
duplexes with two single base deletion defects at
varying positions x and y. The results show that the
binding affinity depends also on the relative position of
the defects (for details see Additional file 1, Fig. S5 and
[26]). The hybridization signal is largest if each defect is
located close to an end. Lowest binding affinities are
observed for defect configurations which divide the
sequence into three roughly equally long subsequences.
Closely spaced defects (with a distance of less than four
nucleotides) systematically increase their impact with
distance.
Figure 2

The "defect profile" shows the position-dependent
impact of single base mismatches, insertions and
deletions on hybridization affinity. Symbols: MM probes with
substituent bases A (red crosses), C (green circles), G (blue stars),
T (light blue triangles); moving average of all MM intensities (black
dashed curve); single base insertion probes with insertion bases
A (red dash-dotted curve), C (green solid curve), G (blue dotted curve),
T (light blue dashed curve). Defect profiles of different probe
sequence motifs. (A) Position dependent impact of various single
base defects on the hybridization affinity for the probe sequence
motif 3'-TATTACTGGACCTGAC-5' (hybridized with the
complementary RNA target sequence). Hybridization signals of
single base deletions (orange dashed curve) are comparable to that
of MMs at the same position. PM probe signal replicates (black
symbols on gray ground) serve as an indicator for spatial bias on the
microarray. Deviations of MM hybridization signals from the mean
profile are mostly MM-type specific. Increased hybridization
signals of certain insertion probes (where the bulged surplus base
is located next to identical bases - Group II bulges [14, 53]) are
due to positional degeneracy of the bulge defects. (B) Position
dependent impact of various single base insertion defects on the
hybridization affinity for the probe sequence motif
3'-GTTTGAATCTCACGTCGTCTCCCC-5' (hybridized with
the complementary DNA target sequence). Insertions of A (red
crosses), C (green circles), G (blue stars), T (light blue triangles),
moving average of all insertion probe intensities (black dashed
curve). Systematically increased hybridization signals of Group II
bulges are discussed in Additional file 1, Fig. S2.



Discussion 

Single base mismatches and base bulges alike show a
strong, trough-shaped position-dependent influence
biased by the considered sequence motif. Experimental
evidence for an influence of the sequence context
(beyond the nearest neighbors) on the stability of single
base pair MMs has been reported previously (hybridization
of short 31 bp linear oligonucleotide duplexes in
bulk solution) by Benight and coworkers [34]; however,
such effects have not yet been systematically quantified.
The commonly used two-state model of nucleic acid
hybridization between the microarray probe P and the
target strand T resulting in the formation of the duplex D
is described by Eq. 1.



$$P + T \rightleftharpoons D \tag{1}$$



In thermodynamic equilibrium duplex nucleation
(determined by the slow nucleation rate k_nuc) is balanced
by duplex dissociation with the dissociation rate k_diss.
The widely used two-state nearest-neighbor model
(including mismatched NN-dimers as described by
[10]) cannot provide an explanation for this positional
influence: it does not account for the position of the
individual nearest-neighbor dimers. We assume that the
nucleation rates k_nuc of very similar duplexes (differing
by a single base pair, e.g. a PM duplex and a
corresponding mismatched duplex) are virtually identical.
Thus, the positional dependence observed experimentally
can be expected to result from differences in
k_diss. In agreement with [25] we show that the positional
influence originates from end-domain unzipping. Our
experimental findings suggest a common mechanism for
DPI that is independent of the defect type. Further, the
relatively long range of the DPI (Fig. 3A and 3B) suggests
that molecular dynamics may well be a good candidate
for an explanation. The symmetry of DPI (with respect to
the duplex ends) and sequence-specific deviations from
the symmetry indicate a zipping related mechanism.
Thus, in order to account for partially denatured duplex
states, we use a double-ended zipper model of the
oligonucleotide duplex to determine mismatched
oligonucleotide duplex stabilities as a function of defect
position. We consider a situation in thermodynamic
equilibrium.

Double-ended Zipper Model

We check if a double-ended zipper model [22-25]
(Fig. 4), considering end-domain-denaturation only, is
appropriate to describe the experimental observations.
Internal denaturation, due to the large bubble initiation
barrier (owing to stacking interactions towards both
sides of a nucleotide) and due to the relatively short
length of the duplexes, is expected to be negligible [22].
Figure 3

Comparison of simulation results with the experimentally determined hybridization affinities for two probe
sequence motifs (A) and (B). The four small sub-figures in the top section (from top left to bottom right) show the partition
function Z and the duplex binding constant K as a function of defect position x (semi-logarithmic plots), the NN-free energies
Δg' of particular NN-pairs as a function of NN-pair position x_NN, and the statistical weight for complete duplex dissociation w_0
as a function of defect position. Irregularities in Z(x) at the duplex ends are an artifact caused by the fact that only a single NN-pair
is affected by a MM-base pair at the duplex end. The middle sub-figure shows the base pair opening probabilities (the
fraction of strands in which the corresponding base pair at position x_b is unzipped) as a function of the defect position. The
spectrum of differently colored curves encodes the different defect positions (red - defect at left end; purple - defect at
right duplex end). The bottom sub-figure compares the experimentally determined MM defect profile (mismatched base: A
(red cross), C (green circle), G (blue star), T (cyan triangle); gray symbols correspond to PM probes) with the simulated MM defect
profile θ(x) (dashed orange line). With Δg'_def = 1 kcal/mol (at the simulation temperature of 325 K) and an error rate of
12 percent (per synthesis step) the calculated defect profile θ(x) matches the experimental data well.
Figure 4

Double-ended zipper model of the oligonucleotide
duplex. (A) Unzipping of the relatively short duplexes is
initiated at the duplex ends only. The end-domain opening,
which progresses back and forth (nucleotide by nucleotide)
in a stepwise, zipper-like fashion, can be considered a biased
random walk. The energy level of the partially denatured
hybridization state S_{k,l} (with respect to the completely
hybridized ground state) is determined by summation over
the NN free energies of the unzipped NN-pairs (from 1 to k
and from l to N). (B) Single base MMs (non-Watson-Crick
base pairing) affect the stabilities of two adjacent NN-pairs at
positions x and x + 1. (C) Base insertions and deletions result
in bulged duplexes with an unpaired base. The surplus base
(depicted in a looped out conformation), similar to a MM
defect, results in a reduced binding affinity.



Using a partition function approach the impact of point
defects is investigated at thermodynamic equilibrium. We
perform this analytically, independently of a particular
sequence, as well as numerically with sequence-dependence
- using unified NN-parameters [8].

According to Craig et al. [35] a kinetic scheme describing
helix growth and dissociation is given in Eq. 2.



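$$P + T \underset{k_{diss}}{\overset{k_{nuc}}{\rightleftharpoons}} D_1 \underset{k_-}{\overset{k_+}{\rightleftharpoons}} D_2 \underset{k_-}{\overset{k_+}{\rightleftharpoons}} \cdots \underset{k_-}{\overset{k_+}{\rightleftharpoons}} D_N \tag{2}$$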



k_+ and k_- are the fast zipping and unzipping rates
determined by the nearest-neighbor propagation
parameters of the individual base-pair doublets. The
time-evolution of the oligonucleotide zipper can be
considered a biased random walk with a finite
probability for complete dissociation (described by the
duplex dissociation rate k_diss). Since we consider
thermodynamic equilibrium, we can use a partition function for
fast numerics.



Partition Function Approach (PFA) to Investigate
Oligonucleotide Duplex Thermodynamics

We use a partition function approach [22-25] and
investigate if the double-ended zipper model can
reproduce our experimental results. On the basis of
unified NN-parameters [8] we calculate statistical
weights of partially denatured duplex states. The effect
of partial binding with respect to microarray data was
discussed earlier in [24-26].

The partition function Z_D of the duplex (Eq. 3) is the
sum of the statistical weights w_{k,l} of all partially
hybridized duplex states S_{k,l} (see Fig. 4).

$$Z_D = \sum_{k=0}^{N-1} \sum_{l=k+1}^{N} w_{k,l} = \sum_{k=0}^{N-1} \sum_{l=k+1}^{N} e^{\Delta G'_{k,l}/RT} \tag{3}$$

The statistical weight w_{k,l} of the partially denatured state
S_{k,l} is calculated from the sum ΔG'_{k,l} of NN free energies
of the unzipped duplex sections (Eq. 4). ΔG'_{k,l} can
be considered as the free energy level of the partially
denatured state.

$$\Delta G'_{k,l} = \Delta G'_k + \Delta G'_l, \qquad \Delta G'_k = \sum_{i=1}^{k} \Delta g'_i, \qquad \Delta G'_l = \sum_{i=l}^{N} \Delta g'_i \tag{4}$$

NN free energies of Watson-Crick NN-pairs are deduced
from unified NN parameters [7].

$$\Delta g'_i = \Delta h'_i - T \cdot \Delta s'_i \tag{5}$$

For the completely dissociated duplexes we estimate
partition functions of probes Z_P and targets Z_T as

$$Z_P \cdot Z_T = e^{\Delta G_D/RT} \tag{6}$$

For simplicity duplex initiation free energies have been
neglected here. Based on the duplex sequence we can
now calculate the duplex binding constant

$$K = \frac{Z_D}{Z_P Z_T} = Z_D \cdot e^{-\Delta G_D/RT} \tag{7}$$
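The partition function approach of this section can be illustrated numerically. The sketch below is an assumption-laden reading of the model rather than the authors' code: states are indexed by the number of NN pairs opened from the left (k) and from the right (l) with at least one NN pair left closed, weights are taken relative to the fully hybridized state, initiation is neglected, and a mismatch is modeled by adding a destabilization to the two flanking NN pairs. All function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def zipper_partition_function(nn_dg, temperature):
    """Sum the weights of all end-unzipped states of a duplex.

    nn_dg: NN free energies in kcal/mol (negative = stabilizing), left to right.
    The weight of a state is exp(sum of opened NN free energies / RT), so the
    fully closed state contributes exactly 1.
    """
    rt = R_KCAL * temperature
    n = len(nn_dg)
    prefix = [0.0]  # prefix[k] = free energy of the first k NN pairs
    for dg in nn_dg:
        prefix.append(prefix[-1] + dg)
    total = prefix[n]
    z = 0.0
    for k in range(n):              # NN pairs opened from the left end
        for l in range(n - k):      # NN pairs opened from the right end
            opened = prefix[k] + (total - prefix[n - l])
            z += math.exp(opened / rt)
    return z


def binding_constant(nn_dg, temperature):
    """K = Z_D * exp(-dG_D / RT), with dG_D the plain sum of NN free energies."""
    rt = R_KCAL * temperature
    return zipper_partition_function(nn_dg, temperature) * math.exp(-sum(nn_dg) / rt)


def with_mismatch(nn_dg, base_index, ddg_def):
    """Destabilize the NN pairs on both sides of the mismatched base."""
    modified = list(nn_dg)
    for i in (base_index - 1, base_index):
        if 0 <= i < len(modified):
            modified[i] += ddg_def
    return modified


def defect_profile(nn_dg, temperature, ddg_def):
    """Binding constant relative to the perfect match, per defect position."""
    k_pm = binding_constant(nn_dg, temperature)
    return [binding_constant(with_mismatch(nn_dg, x, ddg_def), temperature) / k_pm
            for x in range(len(nn_dg) + 1)]
```

For a homogeneous duplex, for instance `defect_profile([-1.4] * 15, 310.0, 2.0)`, the relative binding constant is highest for defects at the ends and falls towards the center, which is the trough shape discussed in the text.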
Consideration of Point Defects

We introduce a defect parameter Δg'_def (a simplified
description of the mismatch NN parameters in [7, 10])
to account for the point defect at the defect position x (a
similar approach is described in [25]).

An analytical derivation of the DPI for homopolymer
sequences shows that the partition function (provided as
a function of defect position - see Eq. 8) is increased for
defects located near the duplex ends.



[Equation 8: the partition function Z(x) as a function of the defect position x]

(8)

In Eq. 8 the defect impact δΔg'_def = Δg'_def − Δg' has
been factored out, revealing a general (defect-type
independent) position dependence that is largely governed
by the distance of the defect at position
x from the duplex ends. Defects proximate to the duplex
ends increase end-domain opening. The partition
function is increased due to the number of thermally
populated (partially denatured) duplex states. The defect
destabilization δΔg'_def determines how far Z is
elevated with respect to the perfect match partition
function (Z_PM ≈ 1) and thus how far the DPI propagates
into the interior of the duplex. With Eqs. 7 and 8 we
obtain an expression for the DPI on the duplex binding
constant K(x).



[Equation 9: the duplex binding constant K(x) as a function of the defect position x]

Fig. 5 illustrates Eq. 9 for two different duplex stabilities.

While defects near the duplex ends result in low
mismatch discrimination only (i.e. small reduction of
K with respect to the PM binding affinity), defects in the
center result in higher MM discrimination as K then
approaches the value of the two-state equilibrium
constant. NN-pair free energy increments δΔg'_def for
single base MMs are in the range of 1 to 3 kcal/mol per
NN pair (derived from NN parameters [8, 10]).



Figure 5

Positional influence of single base MM defects on the
duplex stability for two different NN pair free
energies Δg' at a temperature of 310 K. Curves a to
f correspond to defect destabilization values δΔg'_def of 0 to
5 kcal/mol (incrementally increased by 1 kcal/mol). Defect
destabilization δΔg'_def is quoted per affected NN pair.
(A) Δg' = -1.4 kcal/mol, this corresponds to an average
NN-pair free energy; (B) Δg' = -0.8 kcal/mol corresponding
to a weakly bound sequence of A·T and T·A base pairs.



Employing these values in Eq. 9 for Δg' = -1.4 kcal/mol
(Fig. 5A), DPI propagation is restricted to 3 or 6 NN-pairs,
respectively. However, in subsequences with
weakly bound NN-pairs (as demonstrated in Fig. 3B)
the DPI can propagate further towards the middle of the
duplex.

Relation Between the Hybridization Signal and the
Binding Free Energy ΔG_D

In order to compare our numerical analysis to the
experimentally observed hybridization signals we need
to understand how the hybridization signal (fluorescence
intensity from hybridized targets) is linked to
duplex stability. As detailed below the assumption of a
single (homogeneous) binding affinity within a microarray
feature of the Langmuir adsorption model does not
describe the experimentally observed hybridization
signal intensities well. In this section we account for
the heterogeneity that is introduced by in situ synthesis
related random mutations of the microarray probe
sequences.

The importance of the adsorption model for the
description of microarray hybridization has been discussed
previously in [36-39]. In the simplest description
the equilibrium between single stranded probes and
targets and hybridized duplexes T + P ⇌ D can be
described by a Langmuir-type adsorption isotherm (Eq.
10). Under our experimental conditions targets were in
sufficient excess; the target concentration [T] ≈ [T_0] can
be taken as constant. Since the hybridization signal
intensity is expected to be proportional to the fraction of
hybridized probes θ = [D]/[P_0] we will in the following
refer to θ as the hybridization signal.






$$\theta = \frac{K[T_0]}{1 + K[T_0]} \tag{10}$$



Taking K = e^{-ΔG_D/RT} we obtain a sigmoidal relation
between the hybridization signal and duplex free energy

$$\theta = \frac{[T_0]\, e^{-\Delta G_D/RT}}{1 + [T_0]\, e^{-\Delta G_D/RT}} \tag{11}$$
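The Langmuir relation between hybridization signal and duplex free energy can be evaluated directly. The sketch below assumes the two-state binding constant K = exp(-ΔG/RT) and a fixed target concentration in mol/l; the function name and default values are illustrative choices.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def langmuir_signal(dg_duplex, target_conc=1e-9, temperature=310.0):
    """Fraction of hybridized probes for a single, homogeneous binding affinity.

    dg_duplex: duplex binding free energy in kcal/mol (negative = stable).
    target_conc: target concentration in mol/l, assumed constant (excess target).
    """
    k_times_t = math.exp(-dg_duplex / (R_KCAL * temperature)) * target_conc
    return k_times_t / (1.0 + k_times_t)
```

The signal reaches one half where K[T_0] = 1, that is at ΔG = RT ln[T_0]; scanning `dg_duplex` around that value shows how quickly the curve saturates on either side.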



Our experimental data suggest an approximately linear
relation between the hybridization signal and the duplex
binding free energies (within the free energy range
covered by the defect profiles). However, with Eq. 11
an approximately linear relation between θ and ΔG_D is
only provided within a narrow range δΔG_D ≈ 6 kcal/mol
(at T = 310 K and [T_0] = 1 nM). This cannot reproduce
the experimentally observed DPI of the hybridization
signal, since the free energy range of the defect profile
exceeds the transition region. To investigate how the
fluorescence intensity of hybridized targets is related to
duplex stability on the microarray surface we performed
a hybridization assay comprising sets of probes in which
the probe length (assumed to be roughly proportional to
duplex free energy) is incrementally increased (inset in
Fig. 6). The experimental results in Fig. 6 show a sigmoid
relation between the hybridization signal and probe
length. However, the transition region extends over at
least 13 base pairs (δΔG_D,37 ≈ 20 kcal/mol) over which a
monotonic increase of the hybridization signal is
observed. In agreement with our findings a linear
relation between microarray hybridization signals (on
spotted microarrays) and duplex binding free energies
ΔG_D (derived from calorimetric measurements) has
been reported recently by Fish et al. [40]. The large
deviation from the Langmuir equation agrees with
previous observations [25, 41]. An effective isotherm
with a broadened transition region, a Sips isotherm, has
been reported [25, 42, 43] to provide a better description
of surface hybridization on microarrays. This isotherm
can result from a heterogeneous, Gaussian distribution of
binding affinities. Reasons given for the heterogeneity
include variation of the probe local environment, surface
electrostatics [44] and entropic blockage [45]. As we
show in the following, a major contribution to the
heterogeneity of binding affinities is probe polydispersity
[25, 38, 46, 47], which is a result of sequence defects
generated in the in situ synthesis process of DNA Chips,
which introduces single base mismatches, base bulges
and truncations.




Figure 6

Hybridization signal versus duplex stability. The
sigmoid transfer function θ(ΔG_D) of the Langmuir isotherm
(right scale) has a narrow transition region (δΔG_D ≈ 6 kcal/mol
at a temperature of 310 K and a target concentration of
1 nM). Microarray hybridization signals (left scale) for
incrementally increased duplex stabilities: The probe
sequence motif was translated along the target sequence in
increments of two bases (see inset), thus providing a set of
different curves. All probes were hybridized with the
common target sequence UKA (1 nM in 5× SSPE, for 20
minutes at 45°C). The approximately linear increase of the
hybridization signal in the transition region extends over at
least 13 base pairs (δΔG_D,37 ≈ 20 kcal/mol).



Assuming a stepwise error rate of 10%, more than 90%
of the 25 mer duplexes contain at least one synthesis
error [13]. Since the number of synthesis errors per probe
follows a binomial distribution, the majority of the
strands contains between one and three single base
defects.
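The figures quoted above follow from the binomial distribution and can be checked with a short calculation. The sketch assumes one independent error opportunity per coupling step, 25 steps for a 25 mer, and a per-step error rate of 0.1; the function name is illustrative.

```python
from math import comb


def error_count_distribution(n_steps, error_rate):
    """Binomial probabilities of finding 0, 1, ..., n_steps synthesis errors."""
    return [comb(n_steps, m) * error_rate ** m * (1.0 - error_rate) ** (n_steps - m)
            for m in range(n_steps + 1)]


distribution = error_count_distribution(25, 0.10)
fraction_with_errors = 1.0 - distribution[0]        # at least one error
fraction_one_to_three = sum(distribution[1:4])      # one, two or three errors
```

Under these assumptions `fraction_with_errors` is 1 - 0.9^25, which exceeds 0.9, and `fraction_one_to_three` is larger than one half, in line with the statement that most strands carry between one and three defects.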

We calculate binding constants K_i of the individual,
randomly "mutated" probe sequences on the basis of the
zipper model. Using the approach of Forman et al. [48]
we obtain the total hybridization signal by summing up
over the distribution of probes, where the contribution
of each individual mutated probe θ_i is described by a
Langmuir equation (Eq. 10) with the binding constant
K_i. Probe polydispersity (in length as well as in
sequence) reproduces a "stretched isotherm" [47] (similar
to a Sips isotherm), with a significantly broadened
transition region. This explains our experimental results
in Figs. 6 and 2 well. A simulation of the transfer
function θ(ΔG_D) for various error rates and a comparison
between the experimental data in Fig. 6 and the
corresponding simulation results are provided in Additional
file 1, Fig. S6.
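How a spread of binding affinities broadens the transition region can be illustrated by averaging Langmuir isotherms over a distribution of free energies. The sketch below uses a Gaussian spread of assumed width `sigma` as a simplified stand-in; it does not reproduce the sequence-specific distribution of mutated probes computed with the zipper model, and all names and defaults are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def langmuir_signal(dg_duplex, target_conc=1e-9, temperature=310.0):
    """Langmuir isotherm for a single binding free energy (kcal/mol)."""
    k_times_t = math.exp(-dg_duplex / (R_KCAL * temperature)) * target_conc
    return k_times_t / (1.0 + k_times_t)


def heterogeneous_signal(mean_dg, sigma, target_conc=1e-9, temperature=310.0,
                         n_points=201):
    """Average of Langmuir isotherms over a Gaussian spread of free energies.

    sigma: assumed standard deviation of the binding free energy in kcal/mol.
    The average is taken on a regular grid covering +/- 4 sigma.
    """
    if sigma == 0.0:
        return langmuir_signal(mean_dg, target_conc, temperature)
    offsets = [(-4.0 + 8.0 * i / (n_points - 1)) * sigma for i in range(n_points)]
    weights = [math.exp(-0.5 * (offset / sigma) ** 2) for offset in offsets]
    weighted = sum(w * langmuir_signal(mean_dg + offset, target_conc, temperature)
                   for w, offset in zip(weights, offsets))
    return weighted / sum(weights)
```

Compared with `langmuir_signal`, the averaged curve rises more slowly on both sides of the midpoint, which is the qualitative behavior of a stretched or Sips-like isotherm.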
Numerical Analysis of Mismatched Duplex
Stability - Comparison with Experimental Results

To model experimental results with the partition
function approach we choose the NN free energy of the
mismatched base pair Δg'_def as a free parameter. Δg'_def =
1 kcal/mol (at T = 325 K) describes our experimental
observations (in particular the dominating positional
influence with respect to defect type-related influences)
best - see Fig. 3. This value is also in good agreement
with bulk solution parameters [10]. Results of the
numerical simulation (in Fig. 3A) demonstrate that the
shallower slope of the hybridization signal at the right
duplex-end corresponds to a series of weak NN pairs (as
anticipated by Eq. 9). The partition function Z(x) largely
determines the positional influence. Additionally, as
shown in Fig. 3B, defect-type related influence (the
difference between MM and PM free energies δΔg' affects
the statistical weight of the completely dissociated
duplex w_0) is reflected in the hybridization affinity
K(x) and in the hybridization signal θ(x). In addition to
single base pair defects our binding affinity model
reproduces well our experimental results on the binding
affinities of oligonucleotide duplexes with two single
base deletion defects (for details see Additional file 1,
Fig. S5).

In Fig. 7 we investigate the influence of heterogeneous
probe-target binding affinities (see previous paragraph)
on the shape of the defect profile. If the range of the
mismatched duplex free energies is within the transition



Figure 7

Influence of the synthesis error rate on the shape of
the single base mismatch defect profile. The defect
profiles (which correspond to the experimental data in Fig. 3
(A) and (B)) were calculated for error rates between 0 and
20 percent (per nucleotide coupling step). In (A) the positional
influence is rather independent of the error rate - the duplex
free energy range covered by the defect profile is within the
approx. linear transition region. In (B), by contrast, at an error
rate of 0 percent the free energy range of the defect profile
doesn't match the transition region - the positional influence
is hardly visible. At larger error rates the positional influence
becomes dominating over the defect-type related influence.



region we observe an approximately linear relation
between the hybridization intensity and the binding
free energy [40]. If the defect profile's free energy range
exceeds the narrow transition region (like for example
Fig. 7B, at an error rate of 0 percent) the positional
influence remains hardly visible.

Approximation of the Zipper Model with a Position
Dependent Nearest-Neighbor (PDNN) Model

In order to investigate the generality of our finding, we
investigate if PDNN models, which fit experimental data
well, can be inferred from our model framework. We
note that zippering has been previously proposed as the
rationale behind the PDNN model in [25].

In the following we investigate the contribution of each
base pair to duplex stability and ask if there is a position
dependent contribution of Watson-Crick NN pairs in the
same way as for defects.

This idea is the basis of the PDNN model [17, 21, 41] in
which ΔG_b is obtained as a position-dependent
weighted sum of nearest-neighbor free energies,

$$\Delta G_b = \sum_{i=1}^{N} w_i \, \varepsilon_i \qquad (12)$$

Following our theoretical approach we create a set of
7500 oligonucleotide duplexes assembled from a given
set of NN pairs. Although the TSNN (two-state
nearest-neighbor) free energy of these duplexes is identical, the
calculation with the zipper model indicates significant
differences among the stabilities of the individual duplex
sequences (see Additional file 1, Fig. S7). We investigate
the positional distribution of NN pairs in the weakest/
strongest 5% of the duplexes. We find that in the most
stable duplexes the stronger NN pairs are located in the
center, whereas in the least stable duplexes the strong
NN pairs are located near the duplex ends. This result
has been reproduced with the partition function based
UNAfold software (DINAMelt web server [9]) with
excellent agreement to the zipper model. A similar
investigation (see Fig. 8) employing a set of random
duplexes composed of nonidentical NN pairs confirms
the result. In Additional file 1, Fig. S8 we show that
duplex free energy values determined with the zipper
model can indeed be approximated with a PDNN
model. The positional weights - described by a parabolic
function w_i(x) - have their maximum in the middle of
the duplex. The results in Fig. 8 and Additional file 1,
Fig. S8 indicate that the contribution of the outer NN
positions to duplex stability decreases with increasing
temperature. At 340 K the three outermost NN pairs
(which is in total six of 24 NN pairs) have a significantly

Figure 8

Comparison of the two-state nearest neighbor
(TSNN) model and the zipper model (partition
function approach - PFA). To investigate for which
sequences the difference between TSNN free energies and
PFA free energies is largest, we have created a large set of
5000 random 25mer sequences with a similar nucleobase
composition. Scatter plots of TSNN free energies versus PFA
free energies (left) show a very good correlation at a
temperature of 310 K (A). At higher temperatures (340 K
and 360 K, shown in B and C) we find significant deviations
between the two models. We have selected the 5% of
sequences with the largest residuals (highlighted by red
symbols) and determined the position-dependent distribution
of NN free energies (shown right) by averaging (averaged
NN pair free energy versus NN pair position; the Gibbs free
energies in the upper, middle and lower plots refer to
temperatures T = 310 K, 340 K and 360 K, respectively). At
310 K the sequences with the most stable ΔG_PFA have their
weak NN pairs at the outermost two base positions (dashed
black line) and therefore the more strongly binding NN pairs
in the interior. Vice versa, sequences with the weakest ΔG_PFA
(solid green line) have strong NN pairs located at the
outermost positions. The mean NN free energy (average
over all sequences) is indicated by the dotted red line. At 340 K,
for the most stable sequences (according to PFA), the weakest
NN pairs are concentrated at the six outermost base positions
(at each duplex end). At 360 K (which is above the melting
temperature of the duplexes) the NN pair stabilities follow a
parabolic position dependence.



reduced contribution to duplex free energy. At a still
lower temperature of about 310 K the positional weights
converge to w_i(x) = 1, which is equivalent to the TSNN
model.
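The position-dependent weighted sum of Eq. 12 can be illustrated with a minimal sketch. The NN free energies, the duplex length of 12 NN positions and the edge weight of 0.5 are hypothetical values chosen for illustration, and the helper names are not from the paper; only the parabolic shape of the weights and the limit w_i = 1 (TSNN) follow the text.

```python
def parabolic_weights(n, edge_weight):
    """Weights w_i over n NN positions: 1.0 at the duplex center,
    falling parabolically to `edge_weight` at both ends (assumed shape)."""
    center = (n - 1) / 2.0
    return [1.0 - (1.0 - edge_weight) * ((i - center) / center) ** 2
            for i in range(n)]


def pdnn_free_energy(nn_energies, weights):
    """Position-dependent weighted sum of NN free energies (Eq. 12)."""
    return sum(w * e for w, e in zip(weights, nn_energies))


# Hypothetical NN free energies in kcal/mol (more negative = stronger pair).
strong, weak = -2.2, -1.0
center_strong = [weak] * 4 + [strong] * 4 + [weak] * 4   # strong pairs inside
ends_strong = [strong] * 2 + [weak] * 8 + [strong] * 2   # strong pairs at ends

tsnn = [1.0] * 12                             # w_i = 1 recovers the TSNN sum
pdnn = parabolic_weights(12, edge_weight=0.5)

dg_tsnn_center = pdnn_free_energy(center_strong, tsnn)
dg_tsnn_ends = pdnn_free_energy(ends_strong, tsnn)
dg_pdnn_center = pdnn_free_energy(center_strong, pdnn)
dg_pdnn_ends = pdnn_free_energy(ends_strong, pdnn)
```

Under these assumptions both duplexes have the same TSNN free energy, while the weighted sum ranks the duplex with its strong NN pairs in the center as the more stable one, in line with the qualitative result described above.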



Conclusion 

In this paper we studied, experimentally and theoretically,
the stability of short (l < 26 bp) linear surface-bound
oligonucleotide duplexes with single base defects.
We demonstrated that the rationale behind positional
dependent models of oligonucleotide duplex stability is
the partial denaturation of the duplexes. We have shown
that the strong influence of the defect position on
mismatch discrimination [11-14, 16, 49] and the
influence of the sequence context - beyond nearest
neighbors [14, 34] - can be quantitatively inferred from a
molecular zipper model. Partial (end-domain) denaturation
of the duplex as proposed by us in [26] as well as in
[16, 24, 25] results in a positional influence that is
entropic in nature. The zipping process is modulated by
the sequential arrangement of the base pairs. The model
confirms the observed influence of the sequence context
beyond the nearest-neighbors. Further, the zipper model
provides a theoretical foundation to the positional
dependent nearest-neighbor model of Zhang et al. [17].

In the commonly employed two-state nearest-neighbor
model, nucleic acid duplex hybridization/denaturation is
considered to be an all-or-none process. According to the
literature, end-fraying effects are indeed expected to be
small beyond three bases [34]; however, in our studied
case, we conclude that end-fraying plays a non-negligible
role. This is surprising since the dissociation probability
of individual base pairs decreases towards the center of
the duplex in an exponential fashion (see Additional file 1,
Fig. S4) and remains very low for most NN pairs.

We propose that the effect of the defect position on
probe-target binding affinities becomes apparent in the
hybridization signal intensities due to the unavoidable
probe polydispersity of optical synthesis. It indeed
appears that the positional dependence of single base
MM discrimination is more commonly observed on
photolithographically produced DNA oligonucleotide
arrays [11-14] rather than (in large scale studies) on
spotted microarrays [40, 50, 51] or in solution-phase
experiments. We notice, however, that in small studies
(investigating few sequences) a positional influence in
solution [52] and on spotted microarrays [49] has been
reported. The probe polydispersity in our experiments
smoothes out the steep sigmoid relation between the
hybridization intensity and binding free energy ΔG_b that is
expected for defect free probes, and explains why (within a
relatively broad range of ΔG_b, about 20 kcal/mol) variations
of the binding free energies - like for example the influence
of the defect position - are reflected (by means of an
approximately linear relation) in the hybridization signal
intensities.

ABSTRACT

Hybridization of rRNAs to microarrays is a promising
approach for prokaryotic and eukaryotic species
identification. Typically, the amount of bound target
is measured by fluorescent intensity and it is assumed
that the signal intensity is directly related to the target
concentration. Using thirteen different eukaryotic
LSU rRNA target sequences and 7693 short perfect
match oligonucleotide probes, we have assessed
current approaches for predicting signal intensities
by comparing Gibbs free energy (ΔG°) calculations to
experimental results. Our evaluation revealed a poor
statistical relationship between predicted and actual
intensities. Although signal intensities for a given
target varied up to 70-fold, none of the predictors
were able to fully explain this variation. Also, no
combination of different free energy terms, as
assessed by principal component and neural
network analyses, provided a reliable predictor of
hybridization efficiency. We also examined the
effects of single-base pair mismatch (MM) (all possible
types and positions) on signal intensities of
duplexes. We found that the MM effects differ from
those that were predicted from solution-based
hybridizations. These results recommend against the
application of probe design software tools that use
thermodynamic parameters to assess probe quality
for species identification. Our results imply that
the thermodynamic properties of oligonucleotide
hybridization are by far not yet understood.



INTRODUCTION 

High throughput technologies, such as DNA microarrays, have
significant potential for identifying organisms in many areas of
biomedical science, including health care, biological defense
and environmental monitoring. Several microarray platforms
are currently used: dot blots on synthetic membranes or planar
arrays (1,2) and gel-pad microarrays on glass slides (3-5). In
addition, several platforms are under development: microbead
microarrays (6,7) and electronic (8,9) and cantilever arrays
(10). All platforms share the common attribute that a sensor
detects a signal from target sequences hybridized to
immobilized oligonucleotide probes. The intensity of this signal
provides a measure of the amount of bound nucleic acid from a
sample.

Ribosomal RNAs (rRNA) are particularly suitable for species
identification procedures, because they occur universally,
contain conserved as well as divergent regions, and are highly
abundant in cells. Identification of microorganisms relies
heavily on rRNA hybridization schemes (11,12), while
applications for small eukaryotic soil or water organisms
are currently emerging (14,15,18). The promise of these latter
applications is that PCR amplification steps may not be
required for detection, since multicellular organisms contain
a sufficient amount of rRNA to allow direct detection of single
individuals on a microarray platform (11-13,18).

In comparison to standard microarray applications for
detecting specific mRNAs, there are extended requirements
for the specific and reliable detection of organisms. First, since
it is necessary to potentially distinguish closely related species,
which differ only at a few nucleotide positions, one can only
use relatively short oligonucleotides as probes, to ensure
specificity. Second, for the same reason, one often has
only a limited set of options for choosing specific probes. And
finally, it is of particular importance that the specific probes
yield a high signal to noise ratio, i.e. can discriminate
accurately between perfectly matching and slightly
mismatching targets.

Accordingly, it is necessary to have a reliable predictor for
the hybridization performance of a specific probe. Tiling
experiments with probes along specific mRNAs have shown
that there can be huge differences in hybridization efficiency
of probes (19,20). Furthermore, it has become clear that the
simple notion that short oligonucleotides with a mismatch
(MM) should hybridize less efficiently than perfect match
(PM) probes is not always applicable. It has been shown
that the hybridization intensity of MM probes can depend
on the nucleotide type (i.e. A, C, G or T) and position of
the MM relative to the termini (4,16) and that some MM
probes yield higher signal intensities to the target than
those of corresponding PM probes (17).

The focus of this study was to assess the utility of in silico
predictions of probe-target duplex stabilities using DNA
microarrays for detecting rRNA sequences in the context of
possible applications for species identification. In particular,
we investigate how well one can predict the hybridization
performance of particular probes in the context of secondary
structure predictions for the rRNA. In addition, we study the
effects of single-base pair mismatches of all possible types and
positions on probe-target hybridizations.

Our specific objectives were (i) to generate a set of probes
forming a PM with target rRNA sequences, (ii) to measure the
signal intensity of each probe on a microarray and to correlate
fluorescent intensity values to theoretically-calculated duplex
stability measures and (iii) to systematically assess the
influence of single-base pair mismatches on signal intensity
values of known target sequences.

We report lack of a simple relationship between hybridizations
of probe-target duplexes as inferred from signal intensity
values and in silico predictions based on Gibbs free energies.
On the other hand, we can show that type and position of the
MM significantly affect signal intensities of target sequences.
Most interestingly, the order of stabilities of MM pairs in
microarrays is different from that observed in solution,
with pyrimidine-pyrimidine MM pairs being more stable
than purine-purine pairs. However, even for these results
the variances were high and cannot be explained for each
individual oligonucleotide. Hence, it is currently not possible
to predict in silico the performance of particular probes in
microarray experiments. Accordingly, we conclude that
microarray designs for organism identification via rRNA
hybridization will require meticulous testing of all possible
oligonucleotide combinations.

MATERIALS AND METHODS 
Experimental material

The ribosomal rRNA targets were derived from two different
projects. For the first project, we used D3-D5 expansion
segment fragments from the LSU of organisms that are present in
the meiobenthos (15,18). These experiments were done in
conjunction with Febit GmbH (Heidelberg), which includes
also the systematic study of PM versus MM comparisons. In
a second project, we have used D1-D2 expansion segment
fragments from nine nematode species, for which we constructed
PM tiling arrays in conjunction with NimbleGen Systems
Inc. (Madison).

Target preparation 

For the first set of experiments, cloned rDNA fragments (18)
from four organisms were used (Table 1). The sequences were
cloned into a pZErO-2 vector (Invitrogen Inc.). Depending on
the orientation of the insert, the plasmids were cut with either
SpeI or XbaI restriction enzymes and in vitro transcribed with
SP6 or T7 RNA polymerase, respectively. The transcription
and labeling mix contained 18 µl of a master-mix (10 mM
ATP, CTP, GTP 8 µl each; 10 mM UTP 6 µl; 1 mM ChromaTide
Alexa Fluor 546-14-UTP 20 µl; 10x Transcription buffer
16 µl; 40 u/µl RNasin 8 µl), 2 µl of SP6 or T7 polymerase
30 u/µl; and 20 µl of the linearized plasmid at 50 ng/µl.

For the second set of experiments, ribosomal rRNA templates
from nine nematode species were derived from a project
in which the D1-D2 region of the LSU rRNA was sequenced
(Table 1). The sequences were amplified using universal
primers (28sFw-tailT3 5'-AATTAACCCTCACTAAAGGGAGCGGAGGAAAAGAAACTA-3';
28sRew 5'-TACTAGAAGGTTCGATTAGTC-3') of which the forward primer
carries a tail with a T3-RNA Polymerase initiation site at
its 5' end. PCR products obtained with these primers
were directly used for in vitro transcription. The
transcription was performed with the MEGAscript Kit (Ambion)
according to the instructions of the supplier. The master-mix
was supplemented with 1.875 mM biotin-conjugated UTP
(PerkinElmer) and 1.875 mM biotin-conjugated CTP
(PerkinElmer) to label all transcripts.

Hybridization 

Each of the rRNAs was diluted in hybridization solution
(5x SSC, 0.2 mg/ml BSA, 12 mM ribonuclease inhibitor,
Ribonucleoside Vanadyl Complex; New England Biolabs) to a
final volume of 100 µl (3.75 ng/µl RNA) and heated to 80°C
for 1 min. The following hybridization and washing protocol
was used: (i) the microarrays were preheated to 70°C, (ii) the
hybridization solution was added to each microarray and the
microarrays were incubated at SO°C for 1 min, (iii) a
low-stringency hybridization was performed by incubating
the microarrays at 45°C for 24 h, (iv) the microarrays were
then washed with a low-stringency buffer (5x SSC at 20°C,
3-fold volume exchange), (v) the first image of the microarrays
was recorded, (vi) the microarrays were washed with a
high-stringency buffer (0.1x SSC at 20°C, 3-fold volume
exchange) and (vii) a second image of the microarrays was
recorded.

Hybridization on the NimbleGen platform was performed
according to the protocol routinely used at NimbleGen.
Briefly, each biotin-labeled rRNA target was separately
hybridized to the specific compartment on the 12-well NimbleGen
array (a single array with 12 compartments physically isolated
from each other), such that no interference between targets
was allowed. Hybridization conditions were similar to that of
the Febit microarray, namely 45°C, 1 M Na+. After 16-20 h
hybridization, the microarray was washed with non-stringent
and stringent buffers and images were recorded.

Probe design 

A set of oligonucleotide probes was generated using a C++
program specifically written for this study. The set consisted of
PM 20mer probes that were complementary to the rRNA targets
(see Target preparation section). Randomly selected 20 nt
long portions of the target were considered as potential
hybridization sites. In addition to the PM probes, single-MM variants
were designed. The entire array of these variants made up a
complete set to investigate the effects of every position of the
20mer and every type of the MM on signal intensity values. All
probes were replicated four times to provide a measure of
intra-microarray reproducibility. In total, 42 456
oligonucleotide probes were synthesized by the GENIOM One
instrument (Febit GmbH, Heidelberg, Germany) on the microarray
as described previously (19).
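As an illustration of what a complete single-MM set for one probe looks like, the following minimal sketch enumerates every single-base substitution of a 20mer. It is not the C++ program used in the study; the function name and the example sequence are hypothetical.

```python
BASES = "ACGT"


def single_mismatch_variants(probe):
    """Return every single-base substitution of `probe` as
    (position, original_base, new_base, variant_sequence) tuples."""
    variants = []
    for pos, original in enumerate(probe):
        for base in BASES:
            if base != original:
                variant = probe[:pos] + base + probe[pos + 1:]
                variants.append((pos, original, base, variant))
    return variants


pm_probe = "ACGTTGCAAGCTTAGGCATC"  # hypothetical 20mer, not from the study
mm_probes = single_mismatch_variants(pm_probe)
print(len(mm_probes))  # 20 positions x 3 alternative bases = 60
```

Each PM probe therefore carries 60 single-MM variants in this scheme, before replication on the array.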

The probes for the NimbleGen experiments were designed
as a tiling set (1 nt shift) of perfectly matching 25 nt
oligonucleotides to the rRNA sequences of the nine nematodes. In
total, 7519 oligonucleotides were synthesized on the surface of
the 12-well NimbleGen array (a single array with 12
compartments physically isolated from each other), each well
containing the full set of oligonucleotides.
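A tiling set with a 1 nt shift can be sketched as a sliding window over the target, with each DNA probe taken as the reverse complement of the RNA window. The function name, the complement table and the short target fragment below are illustrative assumptions, not material from the study.

```python
# RNA base -> complementary DNA base
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def tiling_probes(target_rna, probe_len=25, shift=1):
    """Yield (start, probe) pairs; each DNA probe is the reverse
    complement of one window of the RNA target."""
    for start in range(0, len(target_rna) - probe_len + 1, shift):
        window = target_rna[start:start + probe_len]
        yield start, "".join(COMPLEMENT[b] for b in reversed(window))


target = "GGAUCCGAUUAGCUAGCUAGGCAUCGAUCGGAUUACG"  # hypothetical fragment
probes = list(tiling_probes(target))
```

A target of length n yields n - 25 + 1 probes in this scheme, so the total probe count grows almost linearly with the combined length of the target sequences.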

Oligonucleotide arrays

A light-activated in situ oligonucleotide synthesis was
performed within the GENIOM instrument on the activated
3D reaction carrier, which contained a glass-silicon-glass
sandwich, using a digital micromirror device (Texas
Instruments). Four individually accessible microchannels (referred
to as arrays) were etched into the silicon layer of the DNA
processor and connected to the microfluidic system of the
GENIOM instrument acting as a custom DNA synthesizer.
Oligonucleotides were synthesized using standard DNA
synthesis reagents and RayDite 3' phosphoramidites, carrying a
5'-photolabile protective group (Proligo LLC; Boulder, CO,
USA). Prior to synthesis, the array surface was activated and
enough distance between oligonucleotides was secured with
a spacer to facilitate probe-target interaction and avoid
probe-probe interference.



Thermodynamic calculations 

The following thermodynamic parameters were calculated
using different software tools: free energies of probe-target
binding (ΔG°_b) and probe-probe dimerization (ΔG°_d) at 45°C
were calculated using an Excel macro written by Matveeva
et al. (21); free energy of self-looping probes (ΔG°_p) at 45°C
was determined by the Mfold program (22). In addition, free
energy of the local denaturation of the target rRNA (ΔG°_t),
and the overall free energy of probe-target binding (ΔG°_ob)
resulting from the consideration of all competing processes
(i.e. ΔG°_t, ΔG°_d, ΔG°_p, see Discussion), were calculated using
RNAstructure v. 4.2 [(23), set with a fixed temperature of 37°C
and a probe concentration of 1 µM, and 1 M Na+]. All tools
used the Nearest-Neighbor model.

Secondary structure prediction 

The secondary structure of rRNA was determined by two
alternative methods. First, the sequences were aligned to
the best BLAST match from the European Ribosomal RNA
Database (24), which contains an alignment of numerous LSU
rRNA sequences with annotated secondary structure. Second,
the rRNA targets were allowed to attain their lowest energy
state. The free energies of the alternative folding were
calculated using RNA folding software (RNAstructure).

Secondary structure of the targets used for hybridization
with NimbleGen arrays was predicted only by the energy
minimization algorithm due to the lack of information about
experimentally determined secondary structure of the
D1-D2 expansion segment.

Data management and statistical analysis

The data were stored in a relational database created in
Microsoft Access, which is available at
http://faculty.washington.edu/pozhit/default.htm. The data were
extracted through queries and analyzed in Microsoft Excel and SAS
(Cary, N.C.). Principal component analysis (PCA) was employed
to examine the distribution of the variables relative to signal
intensity variables and to construct ordination plots. Pearson
product-moment correlation was used to determine the degree
of association between variables. Linear regressions were used
to estimate the relationship of one variable to another (25). The
datasets were prepared for the ANOVA in the way that signal
intensities of all duplexes were averaged using the median
values of the four replicates. The median was used as a measure
of central tendency that is less sensitive to outliers, to account
for possible hybridization artefacts. Median values of every
probe containing a MM were normalized using the median of the
corresponding perfectly matched probe. These normalized
values were then analyzed by three-way ANOVA using
MM position, MM type and type of neighboring nucleotides
(NN) as fixed factors. NN were defined as nucleotides located
on the probe strand one position left and right from a MM
position. Partial Eta squared (η²) was used as a measure of the
degree of association between normalized signal intensity and
analyzed factors. Hochberg's GT2 test was used for
post-hoc analysis of contrast and pair-wise comparisons between
means.
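The replicate averaging and MM normalization described above amount to a ratio of medians. The sketch below uses hypothetical intensity values and a hypothetical function name; it only illustrates why the median is preferred when one replicate is an artefact.

```python
from statistics import mean, median


def normalized_mm_signal(mm_replicates, pm_replicates):
    """Median MM intensity divided by the median intensity of the
    corresponding PM probe."""
    return median(mm_replicates) / median(pm_replicates)


# Hypothetical intensities of four replicates (arbitrary units);
# the value 9000 stands in for a hybridization artefact.
pm = [1200.0, 1180.0, 1250.0, 1210.0]
mm = [300.0, 310.0, 9000.0, 290.0]

ratio_median = normalized_mm_signal(mm, pm)   # robust to the artefact
ratio_mean = mean(mm) / mean(pm)              # dominated by the artefact
```

With these made-up numbers the median-based ratio stays close to the three consistent replicates, while the mean-based ratio is inflated by the single outlier.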

An artificial neural network (ANN) package (Neuroet, 26)
was used to investigate the nonlinear relationships among
input variables (i.e. ΔG° values) and outputs (i.e. signal
intensity values). Unless otherwise specified, the following
settings were used for training NNs: input and output scaling
was set to standard linear (0,1), the logistic transfer
function was used for hidden neurons and pure linear transfer
function was used for output neurons; 80% of the data were
used for training, 10% was used for testing and 10% was
used for validating the NN; and Levenberg-Marquardt
error minimization was used to train the NN. The architectures
of all NNs were optimized prior to conducting analyses by
adjusting the number of hidden neurons (1 to 8) and
identifying the architecture that provided the best predictive
model. Comparison of different predictive models was
conducted by computing their median Akaike's Information
Criterion corrected (AICc) value (27) and determining the
probability that one model was better than another. The
model yielding the lowest AICc score contained the optimal
number of hidden neurons.

Given that rRNA is known to form by far more extensive
secondary structures than mRNA, we reasoned that if there
would be any calculable effect of secondary structure on
hybridization efficiency, it should be most pronounced for
rRNA targets. Thus, in addition to calculating the parameters
suggested by Matveeva et al. (19) and Luebke et al. (20), we
considered also the free energy of the secondary structure of
the rRNA.



RESULTS 

In our first experiment, we constructed a set of PM probes for
four different LSU rRNA fragments from meiobenthos
organisms and synthesized every possible MM combination for all
PM probes (Table 1). The hybridization profiles of PM probes
to their respective target revealed large differences (up to
70-fold) in signal intensities by alignment position
(Figure 1), similarly to what has been observed previously
with mRNA targets (19,20). Matveeva et al. (19) had
suggested that thermodynamic properties of probe folding and
probe hybridization could partly explain these differences in
hybridization efficiency. Luebke et al. (20) suggested that the
predicted free energy of hybridization minus the predicted free
energy for intramolecular folding of the probe provides a
partial explanation, while no consistent correlation was
found with the secondary structure of the mRNA targets.



Relationship of Gibbs free energy terms to signal
intensity values of PM duplexes

To ensure that all possible known parameters are assessed, we
calculated various Gibbs free energy terms singly or in
combination using three different programs, which all consider
nearest-neighbor models (see Materials and Methods). This
includes the predicted free energy of hybridization
(probe-target binding, ΔG°_b), probe hybridization (probe-probe
dimerization, ΔG°_d), free energy for intramolecular folding
of the probes (self-looping of probes, ΔG°_p), free energy of
the local denaturation of the target rRNA (ΔG°_t) and the
overall free energy of probe-target binding (ΔG°_ob) resulting from
the consideration of all competing processes (i.e. ΔG°_t, ΔG°_d,
ΔG°_p, see Discussion). For considering secondary structure
elements in rRNA, one can either use the secondary structure
predictions inferred from alignments and experimental
validation in ribosomes [taken from 'The European Ribosomal RNA
Database', (24)], or the predictions derived from a folding
algorithm that minimizes Gibbs free energy of the structure
[RNAstructure, (23)]. A comparison of the free energies
calculated for secondary structure predicted by alignment and
that predicted by minimum energy revealed that the
alignment-defined secondary structure produced folds that
were significantly different from their energy minimum
(Table 2). This finding is consistent with the notion that the
lowest energy state is not necessarily attained by mature rRNA
and suggests that rRNA reaches a conformation that is
between these extremes (i.e. those based on alignment
predictions and those based on the energy minimum). However, the
two versions of calculation that we use here are the only ones
available based on the current knowledge.

Linear and nonlinear regression (polynomial, up to three
terms) models were used to assess the relationship between the
various ΔG° terms and signal intensity values of probe-target
duplexes. In general, the models poorly explained the
relationship between ΔG° terms and signal intensity values, regardless
of microarray platform used (Febit or NimbleGen, see
below), software package, washing conditions, target
sequence or whether or not secondary structure of the RNA
was considered when ΔG° was calculated (Table 3).
Polynomial models did not fit the data (data not shown) and therefore
were not further considered. One example of a weak linear
correlation is shown for the relationship for ΔG°_ob and signal
intensity values for sequence 1 (Figure 2). In this case, up to
30% of the variability in the data was explained, while all
other correlations for this sequence and the other sequences
were worse (Table 3).
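The "variability explained" by a linear model is the coefficient of determination R² of an ordinary least-squares fit. A minimal sketch follows; the free energies and intensities are hypothetical, and the function name is illustrative rather than part of the software used in the study.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares line y = intercept + slope * x and its R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical free energies (kcal/mol) and signal intensities.
dg = [-30.0, -28.0, -26.0, -24.0, -22.0, -20.0]
signal = [5200.0, 1500.0, 4100.0, 900.0, 2600.0, 1100.0]
slope, intercept, r2 = linear_fit_r2(dg, signal)
```

An R² of 0.30 in this sense means that 70% of the variance in signal intensity is left unexplained by the free energy term.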

Because the first experiment included only relatively few
PM probes, we sought to corroborate these findings with a
second experiment, involving 7519 additional PM probes from
nematodes (Table 1). In this experiment, the R²-values for
certain ΔG° terms on the nematode sequences explained as
much as 74% of the variability of the data (Table 4). However,
this was an exception rather than the rule since many R²-values
(19 out of 63) were not statistically significant. Note
that in the case of Rhabditis terricola, ΔG°_ob had no relation
with signal intensity. It is particularly surprising since in
theory, the ΔG°_ob should account for more variability than all
other terms, but this is not the case, supporting the notion
that predicted thermodynamic parameters do not accurately



Figure 2. Relationship between ΔG°_ob and signal intensity for PM probes;
dashed trend line, stringent wash.



predict signal intensity values of duplexes with rRNAs in this
experiment as well.

These results are somewhat in contrast to the results from
Matveeva et al. (19) and Luebke et al. (20), who found
consistently weak correlations for the free energy terms they
tested. However, the magnitudes of their correlations are within
the range of the subset of experiments, where we also found
some correlations. In balance, we can conclude from these
results that signal intensity values for rRNA hybridizations
are only poorly predicted by in silico software packages.

Since individual free energy parameters are such poor
predictors, Luebke et al. (20) proposed a linear combination of
two parameters, namely the predicted free energy of
hybridization (ΔG°_b) minus the predicted free energy for
intramolecular folding of the probe (ΔG°_p), as a reasonably good
predictor of hybridization intensity. However, this is only
one of all possible combinations of the parameters. To
systematically evaluate all possible linear combinations of
individual parameters, we employed a PCA, which can find even
hidden relationships.

The initial PCA analysis involved constructing 2D
ordination plots of ΔG° terms and GC values and color-coding each
point on the plots by its corresponding signal intensity value.
Examination of the four ordination plots revealed no obvious
relationship between any of the variables and signal intensity
values (data not shown). To more thoroughly investigate the
relationship between ΔG° terms and signal intensity values,
signal intensity values were included as a variable in PCA.
PCA results of the data from different target sequences
revealed that 1H-H2% of the total matrix variance was
explained by three principal axes, with PC1 explaining
^^-^9%, PC2 explaining 20-29%, and PC3 explaining
IS-SO% of the total matrix variance (Table 5). However,
Pearson correlation coefficients of the variables relative to
the PC axes revealed inconsistent results for the data from
different target sequences. For example, in the case of
sequences 1 and 4, PC1 was most strongly positively
correlated to ΔG°_t, while for sequences 2 and 3 PC1 was negatively
correlated to ΔG°_t. For sequences 2 and 3, ΔG°_ob was most
strongly correlated to PC1, while this was negatively
correlated for sequences 1 and 4. Similar results were also obtained
for the other PC axes, indicating differences in the ordination
of variables for data from different target sequences, which
was also evident in the two dimension plots (data not shown).
The same analysis was carried out for the second
experiment on the NimbleGen arrays. Similarly, examination of
the nine ordination plots revealed no obvious relationship
between any of the variables and signal intensity values
(data not shown). In order to more thoroughly investigate
the relationship between ΔG° terms and signal intensity
values, the signal intensity values were included as a variable in
PCA. PCA results of the data from different target sequences
revealed that 83-91% of the total matrix variance was
explained by three principal axes, with PC1 explaining
36-58%, PC2 explaining 17-29%, and PC3 explaining
14-24% of the total matrix variance (Supplementary Tables
S1-S3). However, Pearson correlation coefficients of the
variables relative to the PC axes revealed inconsistent results for
the data from different target sequences.

To assess hidden nonlinear relationships, ANN analysis was
used to investigate the relationship between ΔG° terms and
signal intensity values, because neural networks have been
shown to handle noisy, nonlinear data better than conventional
linear approaches, such as PCA (28). For these analyses, the
optimal number of hidden neurons was found to be 4, when
ΔG° terms are used as inputs and signal intensity values are
used as outputs. A model of the relationship between ΔG°
terms and signal intensity values was generated by training
an ANN using the data from one target sequence and cross
validating the generated model by using data from another
target sequence.

The correlation coefficients between actual and predicted
signal intensity values of the models are shown in Table 6. A
correlation close to 1 or -1 indicates that a model accurately
predicts signal intensity values when provided with ΔG° terms,
while a correlation value close to zero implies no correlation.
We also included the correlation coefficient for each ANN
trained with the same data since it represents the 'best'
possible correlation for each model. Note that the 'best' possible
correlations were based on the analysis of all predicted and
actual signal intensity values in the data from one sequence.
The reason why the 'best' correlations were not exactly 1 or
-1 was because only 80% of the data were used to train the
ANN model. The remaining (20%) of the data were used for
local testing and validation of the model.

Poor correlations of ANN predictions to actual values could
be attributed to over- or under-training of the ANNs. For
example, an over-trained ANN learns to memorize the training
data, and consequently generates high correlations between
predicted and actual values for data it was trained on, but
poor or no correlations for test data that was not used for
training. We carefully trained each ANN model to generalize
predictions by optimizing the architecture of the model prior to
training, and by stopping training when there was no change in
the error over a specified period of time or after a specified
number of iterations [see ref. (26)]. This approach ensured that
each ANN model produced outputs that accurately predict
signal intensity values for ΔG° terms not used for training.
We conclude that the reason the ANN models are unable to
accurately predict signal intensity values when provided with
data from sequences not used for training was because there is
a poor relationship between ΔG° terms and signal intensity
values. These findings corroborate the PCA results and suggest
that no combinations of the ΔG° terms are major determinants
for predicting signal intensity values.
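
The stopping rule described above (stop when the error has not changed for a specified period, or after a specified number of iterations) can be illustrated with a small sketch. The function name `stopping_iteration` and the parameters `patience`, `tolerance`, and `max_iterations` are hypothetical; the source does not state the values or the software used.

```python
def stopping_iteration(errors, patience=5, tolerance=1e-6, max_iterations=1000):
    """Return the 1-based iteration at which training would stop.

    errors: validation error recorded after each training iteration.
    Training stops when the error has not improved by more than
    `tolerance` for `patience` consecutive iterations, or when
    `max_iterations` is reached, whichever comes first.
    """
    best = float("inf")
    stale = 0  # consecutive iterations without meaningful improvement
    for i, err in enumerate(errors[:max_iterations], start=1):
        if best - err > tolerance:
            best = err
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return i
    return min(len(errors), max_iterations)


# Hypothetical error trace: improvement stalls after the third iteration.
trace = [1.0, 0.5, 0.4, 0.4, 0.4, 0.4]
stop_at = stopping_iteration(trace, patience=3)
```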

An assessment of the effects of mismatches on signal
intensity values

Three-way ANOVA was used to assess the effects of
MM position, MM type, and the type of NN that flank a
MM, on normalized signal intensity values (see Materials
and Methods). The model revealed that all three factors had
low, albeit significant effects on the normalized signal
intensity values (Table 7). Most of the variance of normalized signal
intensity was explained by MM position (9.6%), followed by
MM type (LVy*), whereas NN type had the lowest effect on the
observed variance among the factors (1.8%) measured by
partial η². In addition, there were interactions among all
combinations of two factors (Table 7). The strongest interaction
was observed for MM positions and MM types (3.4%), while
interactions between MM position and NN type (1.3%) and
between MM type and NN type (1.2%) were comparable. We
were not able to detect significant effects of simultaneous
interactions among all three factors (Table 7).
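
For reference, partial η² for a factor is conventionally computed as the factor's sum of squares divided by the sum of that value and the error sum of squares. The sketch below shows that calculation; the function name and the example sums of squares are hypothetical and are not taken from Table 7.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared = SS_effect / (SS_effect + SS_error)."""
    if ss_effect < 0 or ss_error < 0:
        raise ValueError("sums of squares must be non-negative")
    total = ss_effect + ss_error
    if total == 0:
        raise ValueError("at least one sum of squares must be positive")
    return ss_effect / total


# Hypothetical sums of squares, chosen only to illustrate the formula.
example = partial_eta_squared(ss_effect=10.0, ss_error=90.0)
```

An effect with `ss_effect=10.0` and `ss_error=90.0` therefore accounts for 10% of the variance not attributed to the other factors.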

Moving the position of the MM away from the 5' or 3'
termini to the center of the probe significantly decreased signal
intensities (Figure 3). ANOVA post-hoc contrasts between
means showed that duplexes with MM between positions ft
and 15 formed a homogenous group (α = 0.05) with the most
pronounced effects on duplex stability. This finding indicates
that the most optimal discrimination of MM from PM duplexes
is provided with the MM in the middle of the duplex. However,
we emphasize that this was an average result, and note
that in some individual cases, MM probes with central
mismatches (positions 9-11) were observed to have signal
intensities that were equal or up to 1.6 times higher than that of
corresponding PM probes.

A heat map on the effects of the MM type by position is
shown on Figure 4. Clearly, there are differences in average
signal intensity by MM type and position. Post-hoc ANOVA
contrasts were able to discriminate five homogenous groups
(Figure two groups with clearly separated extremes (i) GA
and GG mismatches (which destabilize duplexes the most) and
(ii) TC, TU and TG mismatches (which destabilize duplexes
the least). Differences in signal intensity values as a function
of position are clearly visible for these two groups in Figure 4.
These findings indicate that distinguishing PM duplexes from
those containing a single-MM was highly dependent on the
type of MM pairs.

To more fully understand simple patterns of MMs as a
function of type and position, we pooled MM types to three
categories: purine-purine, pyrimidine-pyrimidine and
purine-pyrimidine MM pairs. Figure 6 shows that signal intensities of
duplexes with pyrimidine-pyrimidine MM pairs were more
similar to PM duplexes than purine-pyrimidine or
purine-purine MM pairs. An interaction was evident at the termini
of probes where differences in the normalized signal
intensities among MM pairs were more pronounced towards the 3'
end of the probe. Differences in intensity values at the 5' and 3'
end might be due to the orientation of the probe on the
microarray since the 3' end was closest to the microarray surface.

Figure 7 illustrates the effects of the type of NN that flank a
MM on normalized signal intensity values. We analyzed
separately the cases when a MM is located at the termini of
sequence from the cases when it is located elsewhere. The
reason for this is that a MM at the termini could have only one
neighboring nucleotide, while in all other positions it has two
neighbors. We assessed the effect of NN by categorizing
probes with terminal MMs into two states: those with a purine
and those with a pyrimidine neighbor. Elsewhere we grouped
MMs having nucleotides flanking a MM into three categories:
purines only, pyrimidines only, and purine-pyrimidine
combinations. In addition to the asymmetric impact of MM type at
the end of the probe described earlier, we detected asymmetry
at the ends of the probe concerning NN type. Figure 7 shows
that purine neighbor at the 5' end stabilized the duplex more
than pyrimidine (GT2 post-hoc test, = 0.001). Surprisingly,
at the 3' end the opposite trend is true, although it is not
statistically significant. When non-terminus mismatches
were considered, the most stabilizing effect on the duplex
occurred with purine flanking neighbors. Pyrimidine flanking
neighbors yielded the lowest duplex stability.
Purine-pyrimidine neighbors were in the middle of these two extremes
(Figure 7B, all differences are significant at = 0.001 by GT2
post-hoc test). Interactions between NN type and MM position
and type are significant as previously stated. However, due to
the minor effects on the variance and peculiar patterns of
interaction, we excluded it from further discussion.
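
The pooling of MM types into three categories can be sketched as a small classifier. This is an illustration only: the function names are hypothetical, and reading a two-letter mismatch such as "GA" as (probe base, target base) is an assumption about the notation, not something the text states.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def base_class(base):
    """Classify a single nucleotide letter as purine or pyrimidine."""
    b = base.upper()
    if b in PURINES:
        return "purine"
    if b in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"unrecognized base: {base!r}")


def mismatch_category(probe_base, target_base):
    """Pool a mismatch pair into one of the three categories used in the text."""
    classes = sorted([base_class(probe_base), base_class(target_base)])
    return "-".join(classes)


# Examples using mismatch pairs named in the text.
categories = {mm: mismatch_category(mm[0], mm[1]) for mm in ("GA", "TC", "TU", "AG")}
```

Because the two classes are sorted before joining, "GA" and "AG" fall into the same pooled category even though the text reports that their signal intensities can differ.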



DISCUSSION 

The thermodynamic properties of nucleic acid duplex formation
and dissociation in solution have been well established
(29). For example, the behavior of a probe and a target
sequence in solution can be predicted by using a
nearest-neighbor model (30). However, duplex formation using
surface-immobilized DNA oligonucleotides is less well understood,
presumably due to the complex factors affecting the
kinetics and thermodynamics of target capture. Some factors
affecting duplex formation on DNA microarrays include:
probe density, microarray surface composition and the
stabilities of oligonucleotide-target duplexes, intra- and
inter-molecular self-structures and RNA secondary structures
(21, 31, 32). We reasoned that examination of the thermodynamic
stabilities of probe-target duplexes using existing models
might provide valuable information on the relationships
between predicted stabilities of targets hybridized to
immobilized probes and their corresponding signal intensity values
on DNA microarrays. We also reasoned that the position and
type of MM, and the nature of neighboring bases to the MM
should also affect signal intensity values.

Relationship between thermodynamic predictions and
signal intensity values

As proposed by Matveeva et al. (21), hybridizations of a target
to probes on a planar microarray are affected by several
overlapping processes which include: (i) the affinity of a target to
bind to a probe (AC ,,), (ii) the formation of stem-loop
structures of a probe (AO'"p), (iii) the formation of secondary
structure (loops and helices) of a target (AC,) and (iv) probe to
probe dimerization (ACj) (Figure 9). In addition, the overall
Gibbs free energy of binding (AC'ob) can be calculated by
considering the combined effects of all four terms (i.e. AC i',
AG%, AG-'j and AG°,) on hybridization predictions (23).

A0°, values could have been considered of special relevance
for rRNA, because of the known potential to form extensive
secondary structures. The values were calculated by
considering the secondary structure of the targets as determined
from the LSU rRNA database. Two different approaches
were used to calculate AC, since we did not know if aligned
(constrained) or not aligned (free form) secondary structure
significantly affected free energies determination. The aligned
folding preserves the annotated single strands while the not
aligned folding allows the molecule to reach a conformation
that corresponds to the calculated global energy minimum.

In our analysis, all Gibbs free energy terms were poorly
correlated, and linear and nonlinear regressions had low
R²-values, to signal intensity values of PM probe-target
duplexes. Moreover, there does not appear to be a consistent
pattern in Gibbs free energy terms by target sequence.
Furthermore, while PCA and ANN analyses were able to
establish significant correlations between Gibbs free energy
terms and signal intensity values when each sequence was
separately analyzed, cross validation using different target
sequences revealed inconsistent results. These findings
indicate that Gibbs free energy terms and signal intensity values are
target dependent and suggest that other factors, such as surface
density of the probes (31) and/or brush effects (32), might have
greater effects on signal intensity values than previously
anticipated.

Thermodynamic stabilities of target RNA hybridized to
immobilized oligonucleotide probes have been investigated
in the following studies: (i) Naef and Magnasco (17) and
Mei et al. (33) both described an ad hoc model that examined
the affinity of a probe to a target based on the sum of
position-dependent base-specific contributions, (ii) Zhang et al. (34)
described an ad hoc model that considered position-dependent
nearest-neighbor effects, (iii) Held et al. (35) examined the
effects of free energies of RNA/DNA duplex formation and
(iv) Wu and Irizarry (36) developed a model that considered
both stochastic and deterministic aspects of probe-target
hybridizations. The unifying features of these studies are: (i) they
are all based on the analysis of multiple probes targeting
mRNA transcripts (i.e. expression data), (ii) with exception
of Held et al. (35), they only considered single-base pair
mismatches that occurred in the middle of the duplex (position
13 of 25mers), (iii) they assumed that binding of various RNA
targets was independent and noncompetitive. Unfortunately,
none of the studies satisfactorily predicted signal intensity
values on oligonucleotide microarrays since there were
significant disagreements between actual and predicted values.

The effect of single-base pair mismatches on duplex
signal intensity values

In solution, single-base-mismatches in oligonucleotide probes
can stabilize or destabilize a duplex depending on the identity
of the MM, its position in the helix and its neighboring base
pairs (37). Although it has been established that there are
differences in experimental results conducted in solution versus
those using microarrays (17), we investigated MM type
and position, and neighboring base pairs on signal intensity
values because the effects of these variables on planar
microarrays are not well understood.

We found that the position of the MM affected duplex
stability (as inferred by signal intensity values). This finding
is consistent with previous studies (16) showing that terminal
mismatches are less destabilizing than internal ones. We also
found asymmetry in the pattern of signal intensity values by
position. Specifically, differences in normalized signal
intensities among MM pairs were more pronounced towards the 3'
end of the probe. This phenomenon was presumably due to
orientation of the probe on the microarray since the 3' end was
tethered to the microarray surface. Since electrostatic effects of
the microarray surface are distance dependent, mismatches
closest to the 3' end might be responsible for the observed effect
(38), although further studies are needed to verify this.

Studies conducted in solution have shown that different
MM types cause diverse effects on duplex stability (39).
We found that the order of stabilities of MM pairs in solution
was different from that observed in microarrays (Figure 8). In
general, the microarray results revealed that
pyrimidine-pyrimidine MM pairs were more stable (left side of
Figure 8) than purine-purine MM pairs (right side of
Figure 8). This result was anticipated since purines are
composed of large double-ringed nucleotides that distort the
geometry of the double helix, incurring a large steric and
stacking cost. Hence, MM pairs containing purine destabilize
the duplex and have lower signal intensity values than its
corresponding PM duplex. Pyrimidine-pyrimidine mismatches,
on the other hand, are composed of small single
rings that do not distort the geometry of the double helix,
resulting in higher stabilities and signal intensity values
than MM pairs containing one or two purines. Possible reasons
for the discrepancy in the order of stabilities in solution versus
those in microarrays include: the number of samples examined
[Sugimoto et al. (39) versus this study, 52 versus 10 440 MM
pairs, respectively], the size of the oligonucleotide probes
on the microarrays (9mers versus 20mers, respectively), and
neighboring bases employed (C-MM-G, G-MM-C, C-MM-C,
G-MM-G versus every possible neighboring combination,
respectively).

Interestingly, we also found some asymmetries in signal
intensity values of mismatches that contain the same pair of
bases (e.g. GA and AG; Figures 4 and 5) but differ only in the
sense that MM nucleotide is either on the probe or target strand
of the duplex. Sugimoto et al. (39) also found this asymmetry
for mismatches occurring in short oligonucleotides in solution.
This effect can currently not be explained.

The bases neighboring a probe MM can also significantly
affect signal intensity values. Bases neighboring a MM at the
5'-terminus had contrasting effects on signal intensity values
to those at the 3'-terminus. For example, at the 5'-terminus,
purine neighbors had higher signal intensity values than
pyrimidine neighbors, while at the 3'-terminus, purine neighbors
had the opposite effects on signal intensity values (Figure 7).
These differences may be due to steric effects of MMs at the
3'-terminus, which are close to the microarray surface. In
contrast to bases neighboring a MM at the terminus, bases
neighboring an internal MM yielded a consistent trend:
mismatches flanked by purine neighbors had a more stabilizing
effect on duplexes than other combinations. These findings are
consistent with Sugimoto et al. (39), which showed that both
the MM type and the neighboring bases of the probe influenced
duplex stability.

CONCLUSION 

In summary, there is little evidence to support the notion that
thermodynamic parameters accurately predict signal intensity
values of duplexes with rRNAs on oligonucleotide (20-2.^ nt)
DNA microarrays. As a consequence, we recommend that
thermodynamic criteria (e.g. 21, 40) not be used for designing
oligonucleotide probes for species identification; instead, an
empirical verification of each probe is advised to obtain the
best signal intensities. Thorough empirical calibration of
microarrays has recently been shown to be useful in a related
field [methylation pattern analysis via microarray-based
genotyping, (41)] to select best probes within one or two
optimization and selection cycles. With respect to MM effects,
we find that the position and type of single-base pair MM and
composition of neighboring bases affected the stability of
duplexes on DNA microarrays, but in different ways from
what is known from experiments conducted in solution.
Key differences are: (i) positional effects of MMs were
asymmetric, presumably due to steric effects of mismatches
close to the surface of the microarray, (ii)
pyrimidine-pyrimidine MM pairs were more stable than purine-purine
MM pairs and (iii) duplexes with mismatches flanked by
purine neighbors were more stable than other combinations
of neighbors. However, we point out that even these effects,
although consistent, have only a partial predictive value.

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online.

Exhibit 5: ClustalW2 pairwise sequence alignment results

SeqA  Name  Len(nt)  SeqB  Name  Len(nt)  Score
1     Seq1  10       2     Seq2  10       90

CLUSTAL 2.0.10 multiple sequence alignment

Cancers arise from the accumulation of multiple mutations in genes
regulating cellular growth and differentiation. Identification of such mutations
in numerous genes represents a significant challenge in genetic analysis,
particularly when the majority of DNA in a tumor sample is from
wild-type stroma. To overcome these difficulties, we have developed a new
type of DNA microchip that combines polymerase chain reaction/ligase
detection reaction (PCR/LDR) with "zip-code" hybridization. Suitably
designed allele-specific LDR primers become covalently ligated to
adjacent fluorescently labeled primers if and only if a mutation is present.
The allele-specific LDR primers contain on their 5'-ends "zip-code
complements" that are used to direct LDR products to specific zip-code
addresses attached covalently to a three-dimensional gel-matrix array.
Since zip-codes have no homology to either the target sequence or to
other sequences in the genome, false signals due to mismatch
hybridizations are not detected. The zip-code sequences remain constant and
their complements can be appended to any set of LDR primers, making
our zip-code arrays universal. Using the K-ras gene as a model system,
multiplex PCR/LDR followed by hybridization to prototype 3x3
zip-code arrays correctly identified all mutations in tumor and cell line DNA.
Mutations present at less than one per cent of the wild-type DNA level
could be distinguished. Universal arrays may be used to rapidly detect
low abundance mutations in any gene of interest.

© 1999 Academic Press 
Keywords: zip-code addressing; DNA hybridization; thermostable DNA 
ligase; ligase detection reaction; single nucleotide polymorphism (SNP) 



Introduction 

Abbreviations used: LDR, ligase detection reaction;
FAM, 6-carboxyfluorescein; Mes, 2-(N-morpholino)
ethanesulfonic acid; SNP, single nucleotide
polymorphism.

E-mail address of corresponding author:
barany@mail.med.cornell.edu

Cancers arise from the accumulation of
mutations in genes controlling the cell cycle, apoptosis,
and genome integrity. These mutations may
be inherited or somatic, arising from exposure to
environmental factors or from malfunctions in
DNA replication and repair machinery (Fearon,
1997; Fearon & Vogelstein, 1990; Liu et al., 1996;
Perera, 1997). Oncogenes may be activated by
point mutations, translocation, or gene amplification,
while tumor suppressor genes may be inactivated
by point mutations, frameshift mutations
and deletions (Bishop, 1991; Da Costa et al., 1996;
Venitt, 1996). A major hurdle to detecting
mutations in these genes is that, in primary
tumors, normal stromal cell contamination can be
as high as 70% of total cells, and thus a mutation
present in only one of the two chromosomes of a
tumor cell may represent as little as 15% of the
DNA sequence present in a sample for that gene.
Thus, there is an urgent need to develop technology
that can identify accurately one or more low
abundance mutations, at multiple adjacent, nearby,
and distal loci in a large number of genes.
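
The 15% figure quoted in the Introduction follows from simple arithmetic: if 70% of cells are stromal, 30% are tumor cells, and a mutation on one of two chromosomes accounts for half of the tumor-cell copies of that gene. A minimal sketch of that calculation, with a hypothetical function name and assuming every cell carries the same number of copies of the gene:

```python
def mutant_allele_fraction(stromal_fraction, mutant_copies=1, total_copies=2):
    """Fraction of gene copies in a sample that carry the mutation.

    Assumes all cells (stromal and tumor) carry `total_copies` of the gene
    and that every tumor cell carries `mutant_copies` mutant copies.
    """
    if not 0.0 <= stromal_fraction <= 1.0:
        raise ValueError("stromal_fraction must be between 0 and 1")
    tumor_fraction = 1.0 - stromal_fraction
    return tumor_fraction * mutant_copies / total_copies


# 70% stromal contamination, heterozygous mutation in the tumor cells.
fraction = mutant_allele_fraction(0.70)
```

With `stromal_fraction=0.70` this gives 0.30 x 0.5 = 0.15, matching the text.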

The advent of DNA arrays has resulted in a
paradigm shift in detecting sequence variations
and monitoring gene expression levels on a genomic
scale (Beattie et al., 1995; Brown & Botstein,
1999; Chee et al., 1996; Cronin et al., 1996; DeRisi
et al., 1996; Drobyshev et al., 1997; Eggers et al.,
1994; Gunderson et al., 1998; Guo et al., 1994;
Hacia, 1999; Hacia et al., 1996; Kozal et al., 1996;
Pease et al., 1994; Schena et al., 1996; Shalon et al.,
1996; Southern et al., 1999; Yershov et al., 1996; Zhu
et al., 1998). DNA chips designed to distinguish
single nucleotide differences are generally based
on hybridization of labeled targets (Beattie et al.,
1995; Chee et al., 1996; Cronin et al., 1996;
Drobyshev et al., 1997; Eggers et al., 1994; Guo et al.,
1994; Hacia et al., 1996; Kozal et al., 1996; Parinov
et al., 1996; Sapolsky et al., 1999; Wang et al., 1998;
Yershov et al., 1996) or polymerase extension of
arrayed primers (Lockley et al., 1997; Nikiforov
et al., 1994; Pastinen et al., 1997; Shumaker et al.,
1996). While DNA chips based on these two
formats can confirm a known sequence, the
similarities in hybridization profiles create ambiguities
in distinguishing heterozygous from homozygous
alleles (Beattie et al., 1995; Chee et al., 1996; Eggers
et al., 1994; Kozal et al., 1996; Southern, 1996; Wang
et al., 1998). To overcome this problem, several
methods have been proposed, including the use of:
(i) two-color fluorescence analysis (Hacia et al.,
1996, 1998a); (ii) a tiling strategy that uses 40
overlapping addresses for each known polymorphism
(Cronin et al., 1996); (iii) incorporation of nucleotide
analogues in the array sequence (Guo et al.,
1997; Hacia et al., 1998b); and (iv) adjacent
co-hybridized oligonucleotides (Drobyshev et al.,
1997; Gentalen & Chee, 1999; Yershov et al., 1996).
A recent side-by-side comparison revealed that the
use of hybridization chips for nucleotide discrimination
gave an order of magnitude higher background
than was observed with the primer
extension approach, resulting in an increased likelihood
of false positive identifications (Pastinen et al.,
1997). Nevertheless, solid-phase primer extension
can also generate false positive signals from
mononucleotide repeat sequences, template-dependent
errors, and template-independent errors (Nikiforov
et al., 1994; Shumaker et al., 1996). In addition,
neither of these two types of arrays can detect
cancer mutations when these are present in a
minority of the total target DNA.

Over the past few years, our laboratories have
pursued an alternate strategy in DNA array
design. In concert with polymerase chain reaction/
ligase detection reaction (PCR/LDR) assays carried
out in solution (Barany, 1991a,b; Belgrader et al.,
1996; Day et al., 1995, 1996; Khanna et al., 1999),
our array concept allows for accurate identification
of mutations and single nucleotide polymorphisms
(SNPs). Primary PCR amplification of the gene of
interest is followed by LDR, which uses a
thermostable Tth DNA ligase that links two adjacent
oligonucleotides annealed to a complementary target
if and only if the nucleotides are perfectly
base-paired at the junction (Figure 1(a)). Since a
single-base mismatch prevents ligation, it is possible to
distinguish mutations with exquisite specificity,
even at low abundance (Khanna et al., 1999).
Furthermore, such assays are ideal for multiplexing,
since several primer sets can ligate along a gene
without the interference encountered in
polymerase-based assays (Belgrader et al., 1996; Day et al.,
1995; Khanna et al., 1999). High-throughput detection
of specific multiplexed LDR products is then
achieved via divergent sequences termed
"zip-code" complements which guide each LDR product
to a designated zip-code address on a DNA
array (Figure 1(b)). This concept is analogous to
molecular tags developed for bacterial and yeast
genetics (Hensel et al., 1995; Shoemaker et al.,
1996). Based on recent multiplexed PCR/LDR
results from our laboratory, the new approach
should allow detection of: (i) dozens to hundreds
of polymorphisms in a single-tube multiplex format;
(ii) small insertions and deletions in repeat
sequences; and (iii) low abundance mutations in a
background of normal DNA (Khanna et al., 1999,
and unpublished results).

Results and Discussion 

Zip-code concept and design

Our approach uses microarrays of unique 24- 
base oligonucleotides that are coupled to a three- 
dimensional polymer at known locations. These 
24-mers or zip-codes (Table 1) hybridize specifi- 
cally to molecules containing sequences that are 
complementary to the zip-codes. By linking the 
zip-code complements to fluorescent primers via a 
tandem PCR/LDR strategy, zip-code microarrays 
can be used to assess the presence and abundance 
of mutations in biological specimens. Importantly, 
because the zip-codes represent unique artificial 
sequences, zip-code microarrays can be used as a 
universal platform for molecular recognition 
simply by changing the gene-specific sequences 
linked to the zip-code complements. 

Each zip-code sequence is composed of six tetramers
(designed as described below) such that the
full-length 24-mers have similar Tm values. The 256
(4^4) possible combinations in which the four bases
can be arranged as tetramers were reduced to a set
of 36; these were chosen such that each tetramer
differed from all others by at least two bases
(Figure 2). Tetramer complements, as well as tetramers
that would result in self-pairing or hairpin
formation of the zip-codes, were eliminated.
Furthermore, tetramers that were palindromic, e.g.
TCGA, or repetitive, e.g. CACA, were excluded
(diagonally hatched boxes in Figure 2). The indicated
set of 36 tetramers represents just one of the
possible sets that can be created; alternative sets
can be developed by starting in any of the unused
light gray boxes (Figure 2).
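
The filtering rules in the paragraph above can be sketched as a simple greedy selection. This is an illustration under stated assumptions, not the procedure behind Figure 2: the greedy order, the function names, and the reading of "tetramer complements" as reverse complements are all hypothetical, the hairpin and self-pairing screen is not implemented, and the resulting set need not contain 36 tetramers or match the published set.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def is_palindromic(tetramer):
    # A palindromic tetramer equals its own reverse complement, e.g. TCGA.
    return tetramer == reverse_complement(tetramer)


def is_repetitive(tetramer):
    # A repeated dinucleotide, e.g. CACA (also covers AAAA, CCCC, ...).
    return tetramer[:2] == tetramer[2:]


def select_tetramers(min_distance=2):
    """Greedily pick tetramers that satisfy the stated exclusion rules."""
    chosen = []
    for bases in product("ACGT", repeat=4):  # all 256 tetramers
        t = "".join(bases)
        if is_palindromic(t) or is_repetitive(t):
            continue
        if any(hamming(t, c) < min_distance for c in chosen):
            continue
        if any(t == reverse_complement(c) for c in chosen):
            continue
        chosen.append(t)
    return chosen


tetramers = select_tetramers()
```

Starting the greedy pass from a different tetramer would give a different valid set, which mirrors the remark that the indicated set is just one of several possible sets.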

Six tetramers were chosen from the larger set of
36 for use in designing the zip-codes for the prototype
array. These six tetramers were combined
such that each zip-code differs from all others by
at least three alternating tetramer units (Table 1).
This ensures that each zip-code differs from all
other zip-codes by at least six bases, thus preventing
even the closest zip-code sequences from
cross-hybridizing. The Tm values of correct hybridizations
range from 70°C to 82°C and are at least 24 deg C
higher than that of any incorrect hybridization
(calculated using Oligo 6.0, Molecular Biology
Insights, Inc., Cascade, CO). The concept of using
alternating rows and columns of tetramer units
may be extended to include all 36 tetramers, hence
creating an array with 1296 divergent addresses.

Figure 1. Scheme for PCR/LDR detection of mutations using an addressable array. (a) Schematic representation of
LDR primers used to distinguish mutations. Each allele-specific primer contains an addressable sequence complement
(cZ1 or cZ3) on the 5'-end and the discriminating base on the 3'-end. The common LDR primer is phosphorylated on
the 5'-end and contains a fluorescent label on the 3'-end. The primers hybridize adjacent to each other on target
DNA, and the nick will be sealed by the ligase if and only if there is perfect complementarity at the junction. (b) The
presence and type of mutation is determined by hybridizing the contents of an LDR to an addressable DNA array.
The zip-code sequences are designed to be sufficiently different, so that only primers containing the correct
complement to a given zip-code will remain bound at that address. (c) Schematic representation of chromosomal
DNA containing the K-ras gene. Exons are shaded and the positions of codons 12 and 13 are shown. Exon-specific
primers were used to selectively amplify K-ras DNA flanking codons 12 and 13. Primers were designed for LDR
detection of seven possible mutations in these two codons as described in (a).

Table 1. Zip-code sequences used in prototype array
Zip#    Tetramer order (a)   Zip-code sequence (5' -> 3') (b)

Zip1    1-6-3-2-6-3          TGCG-ACCT-CAGC-ATCG-ACCT-CAGC-spacer-NH2
Zip3    3-6-5-2-2-3          CAGC-ACCT-GACC-ATCG-ATCG-CAGC-spacer-NH2
Zip5    5-6-1-2-4-3          GACC-ACCT-TGCG-ATCG-GGTA-CAGC-spacer-NH2
Zip11   1-4-3-6-6-1          TGCG-GGTA-CAGC-ACCT-ACCT-TGCG-spacer-NH2
Zip13   3-4-5-6-2-1          CAGC-GGTA-GACC-ACCT-ATCG-TGCG-spacer-NH2
Zip15   5-4-1-6-4-1          GACC-GGTA-TGCG-ACCT-GGTA-TGCG-spacer-NH2
Zip21   1-2-3-4-6-5          TGCG-ATCG-CAGC-GGTA-ACCT-GACC-spacer-NH2
Zip23   3-2-5-4-2-5          CAGC-ATCG-GACC-GGTA-ATCG-GACC-spacer-NH2
Zip25   5-2-1-4-4-5          GACC-ATCG-TGCG-GGTA-GGTA-GACC-spacer-NH2

(a) Order of tetramer oligonucleotide segments in the corresponding zip-code sequence. Six tetramers were
chosen from the full set of 36 to prepare the zip-codes for the prototype array. The six tetramers, which were
renumbered for ease of use, are: 1, TGCG; 2, ATCG; 3, CAGC; 4, GGTA; 5, GACC; and 6, ACCT. Closely related
sequences (Zip1, 3, 5), (Zip11, 13, 15) and (Zip21, 23, 25) differ at the first, third, and fifth tetramer positions,
but are identical at the second, fourth, and sixth tetramer positions.

(b) spacer-NH2 = -O(PO2)O-(CH2CH2O)6-PO2-O(CH2)3NH2.

Array preparation 

Numerous types of two and three-dimensional
matrices were examined with respect to: (i) ease of
preparation of the surface; (ii) oligonucleotide loading
capacity; (iii) stability to conditions required
for coupling of oligonucleotides, as well as for
hybridization and washing; and (iv) compatibility
with fluorescence detection. Our currently favored
methodology to construct zip-code arrays involves
initial creation of a lightly crosslinked film of
acrylamide/acrylic acid copolymer on a glass solid
support; subsequently, the free carboxyl groups
dispersed randomly throughout the polymeric surface
are activated with N-hydroxysuccinimide, and
amine terminated zip-code oligonucleotide probes
are added to form covalent amide linkages
(Figure 3(a)). The described coupling chemistry is
rapid, straightforward, efficient, and amenable to
both manual and robotic spotting. Both the activated
surfaces and the surfaces with attached
oligonucleotides are stable to long-term storage.



Optimization of hybridization conditions

Hybridizations of a fluorescently labeled 70-mer
probe onto model zip-code arrays were measured
as a function of buffer, metal cofactors, volume,
pH, time, and the mechanics of mixing (Table 2).
Even with closely related zip-codes, cross-hybridization
was negligible or non-existent, with a signal-to-noise
ratio of at least 50:1. Our experiments
suggest that different zip-codes hybridize at
approximately the same rate, i.e. the level of fluorescent
signal is relatively uniform when normalized
for the amount of oligonucleotide coupled per
address (data not shown). Magnesium ion was
obligatory to achieve hybridization, and less than
1 fmol of probe could be detected in the presence
of this divalent cation (Table 2 and Figure 4). The
hybridization signal was doubled upon lowering
the pH from 8.0 to 6.0, most likely due to masking
of negative charges (hence reducing repulsive
interactions with oligonucleotides) arising from
uncoupled acrylic acid groups in the bulk polymer



Table 2. Effect of hybridization conditions on hybridization signal

Hybridization buffer         Vol. (µl)   Mixing(a)   Time (minutes)   Relative signal

Buffer A                     55          Inter.      30               1
Buffer A minus MgCl2         55          Inter.      30               <0.01
Buffer A                     20          Inter.      30               2.5
Buffer B                     55          Inter.      30               2
Buffer B                     20          Inter.      30               3
Buffer B                     55          Cont.       30               4
Buffer B                     55          Cont.       60               8
Buffer A + Capped Surface    55          Cont.       60               8
Buffer B minus MgCl2         55          Cont.       60               <0.01
Buffer B                     55          Cont.       180              10

Following general procedures described in Materials and Methods,
hybridizations were carried out with 1 pmol of FAMcZip13-Prd [...]
Mes (pH 6.0), 10 mM MgCl2, 0.1% SDS [...]

(a) Mixing was as follows: intermittent (Inter.), manual mixing of the
sample once every ten minutes; continuous (Cont.), mixing of sample at
20 rpm in a hybridization oven.
Figure 2. Tetramers for use in zip-code arrays. The checkerboard pattern
shows all 256 possible tetramers [...] on the top of the checkerboard. To be
included, each tetramer must differ from all others by at least two bases,
and be non-complementary. The chosen tetramers are shown in the white boxes,
while their complements are listed as (number)'. Thus, as an example, the
complementary sequences GACC (20) and GGTC (20') are mutually exclusive in
this scheme. In addition, tetramers that are palindromic, e.g. TCGA
(off-diagonal hatched boxes), or repetitive, e.g. CACA (hatched boxes on
diagonal from upper left to lower right), have been eliminated. All other
sequences which differ from the 36 tetramers by only one base are shaded in
light gray. Four potential tetramers were not chosen as they are either all
A·T or G·C bases.
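The selection rules stated in this caption can be expressed as a small validity check. The following Python sketch is an illustration under stated assumptions, not the procedure the authors used: "complement" is taken to mean reverse complement (consistent with the GACC/GGTC example), "repetitive" is taken to mean a repeated dinucleotide such as CACA, and all function names are hypothetical.

```python
from itertools import combinations

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def is_candidate(tetramer: str) -> bool:
    """Apply the single-tetramer exclusions named in the caption."""
    if reverse_complement(tetramer) == tetramer:   # palindromic, e.g. TCGA
        return False
    if tetramer[:2] == tetramer[2:]:               # repetitive, e.g. CACA (assumed meaning)
        return False
    if set(tetramer) <= set("AT") or set(tetramer) <= set("GC"):
        return False                               # all A/T or all G/C bases
    return True


def is_valid_set(tetramers) -> bool:
    """Check that every pair differs by >= 2 bases and is non-complementary."""
    if not all(is_candidate(t) for t in tetramers):
        return False
    for a, b in combinations(tetramers, 2):
        if hamming(a, b) < 2 or reverse_complement(a) == b:
            return False
    return True


prototype = ["TGCG", "ATCG", "CAGC", "GGTA", "GACC", "ACCT"]
print(is_valid_set(prototype))             # True
print(is_valid_set(prototype + ["GGTC"]))  # False: GGTC is the complement of GACC
```

The check only validates a proposed set; it does not claim to reproduce the particular 36 tetramers chosen for the checkerboard.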



matrix. To confirm this hypothesis, the free carboxyl
groups on arrays to which zip-codes had
already been attached were capped with ethanolamine
under standard coupling conditions.
Hybridizations of the capped arrays at pH 8.0 gave
results comparable to hybridizations at pH 6.0 of
the same arrays without capping. Continuous mixing
proved to be crucial for obtaining good hybridization,
and studies of the time-course led us to
choose one hour at 65 °C as standard. Reducing
the hybridization volume improved the hybridization
signal due to the relative increase in probe
concentration. Further improvements may be
achieved using specialized small volume hybridization
chambers that allow for continuous mixing.

Array hybridization of K-ras LDR products 

PCR/LDR amplification coupled with zip-code
detection on an addressable array was tested with
the K-ras gene as a model system. Exon-specific
PCR primers were used to selectively amplify
K-ras DNA flanking codons 12 and 13. LDR
primers were designed to detect the seven most
common mutations found in the K-ras gene in
colorectal cancer (Figure 1(c) and Table 3). For
example, the second position in codon 12, GGT,
coding for glycine, may mutate to GAT, coding for
aspartate, which is detected by ligation of the
allele-specific primer (containing a zip-code complement,
cZip3, on its 5'-end, and a discriminating
base, A, on its 3'-end) to a fluorescently labeled
common primer (Figure 1(c)).

PCR/LDR was carried out on nine individual
DNA samples derived from cell lines or paraffin-embedded
tumors containing known K-ras
mutations (as described in Materials and Methods).
An aliquot (2 µl) was taken from each reaction and
electrophoresed on a sequencing apparatus to confirm
that LDR was successful (data not shown).
Next, the different mutations were distinguished
by hybridizing the LDR product mixtures on 3 x 3
addressable DNA arrays (each zip-code address
was spotted in quadruplicate), and detecting the
Figure 3. Detection of K-ras mutations on a DNA array. (a) Schematic
representation of gel-based zip-code array. Glass microscope slides treated
with γ-methacryloyloxypropyltrimethoxysilane are used as the substrate for
the covalent attachment of an acrylamide/acrylic acid copolymer matrix.
Amine-modified zip-code oligonucleotides are [...] to
N-hydroxysuccinimide-activated surfaces at discrete locations (see Materials
and Methods). Each position [...] (corresponding K-ras mutation or wild-type
sequence) [...] signal detected as described in Materials and Methods using
a two second exposure time. All nine arrays identified the correct mutant
and/or wild-type for each tumor (G12S, G12R, and G12C) or cell line sample
(Wt, G12D, G12A, G12V, and G13D). [...] containing the G13D mutant are not
incorrect hybridizations, but noise due to imperfections in the polymer.



positions of fluorescent spots (Figure 3(b)). The
wild-type samples, Wt(G12) and Wt(G13), each
displayed four equal hybridization signals at Zip1
and Zip25, respectively, as expected. The mutant
samples each displayed hybridization signals corresponding
to the mutant, as well as for the wild-type
DNA present in the cell line or tumor. The
sole exception to this was the G12V sample, which
was derived from a cell line (SW620) homozygous
for the G12V K-ras allele. The experiment was
repeated several times, using both manually and
robotically spotted arrays, and LDR primers
labeled with either fluorescein or Texas Red. False-positive
or false-negative signals were not encountered
in any of these experiments. A minimal
amount of noise seen on the arrays can be attribu- 
[Figure 4 panels: FluorImager Analysis; Microscope/CCD Analysis. Horizontal
axis: fmol of Fluorescein-Labeled 70-mer Complement to Zip13 Mixed with
9000 fmol of LDR Primers]

Figure 4. Determination of zip-code array capture sensitivity using two
different detection instruments. Quadruplicate hybridizations were carried
out on manually spotted arrays as described in Materials and Methods. The
graphs depict quantification of the amount of captured 70-mer complement
using either a FluorImager [...] or [...] epifluorescence [...]
hybridizations to an individual array. The [...] on each graph is the
average of the backgrounds from all four arrays.
Table 3. Primers designed for K-ras mutation detection by PCR/LDR/array hybridization

Primer                   Sequence (5' -> 3')

K-ras exon 1 forward     ATAAGGCCTGCTGAAAATGACTGAA
K-ras exon 1 reverse     CTGCACCAGTAATATGCATATTAAAACAAG

cZip1-K-ras c12.2WtG     GCTGAGGTCGATGCTGAGGTCGCAAAACTTGTGGTAGTTGGAGCTGG
cZip3-K-ras c12.2D       GCTGCGATCGATGGTCAGGTGCTGAAACTTGTGGTAGTTGGAGCTGA
cZip5-K-ras c12.2A       GCTGTACCCGATCGCAAGGTGGTCAAACTTGTGGTAGTTGGAGCTGC
cZip11-K-ras c12.2V      CGCAAGGTAGGTGCTGTACCCGCAAAACTTGTGGTAGTTGGAGCTGT
K-ras c12 Com-2          pTGGCGTAGGCAAGAGTGCCT-fluorescein
                         pTGGCGTAGGCAAGAGTGCCT-Texas Red

cZip13-K-ras c12.1S      CGCACGATAGGTGGTCTACCGCTGATATAAACTTGTGGTAGTTGGAGCTA
cZip15-K-ras c12.1R      CGCATACCAGGTCGCATACCGGTCATATAAACTTGTGGTAGTTGGAGCTC
cZip21-K-ras c12.1C      GGTCAGGTTACCGCTGCGATCGCAATATAAACTTGTGGTAGTTGGAGCTT
K-ras c12 Com-1          pGTGGCGTAGGCAAGAGTGCC-fluorescein
                         pGTGGCGTAGGCAAGAGTGCC-Texas Red

cZip23-K-ras c13.4D      GGTCCGATTACCGGTCCGATGCTGTGTGGTAGTTGGAGCTGGTGA
cZip25-K-ras c13.4WtG    GGTCTACCTACCCGCACGATGGTCTGTGGTAGTTGGAGCTGGTGG
K-ras c13 Com-4          pCGTAGGCAAGAGTGCCTTGAC-fluorescein
                         pCGTAGGCAAGAGTGCCTTGAC-Texas Red

The PCR primers were specifically designed to amplify exon 1 of K-ras
without co-amplifying N- and H-ras. The allele-specific LDR primers
contained 24-mer zip-code complement sequences on their 5'-ends (bold) and
the discriminating bases on their 3'-ends (underlined). The common LDR
primers contained 5'-phosphate groups and either a fluorescein or a
Texas Red label on their 3'-ends.
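The construction of an allele-specific LDR primer described in this footnote (zip-code complement on the 5'-end, discriminating base on the 3'-end) can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the Zip13 sequence is the one implied by its tetramer order in Table 1, and the upstream K-ras sequence is assumed from the primer listings rather than taken from a reference genome.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Zip13 as implied by tetramer order 3-4-5-6-2-1 in Table 1 (assumption).
ZIP13 = "CAGCGGTAGACCACCTATCGTGCG"

# K-ras sequence immediately 5' of the first base of codon 12, as it
# appears in the c12.1 primers (assumption for illustration).
UPSTREAM_C12_1 = "ATATAAACTTGTGGTAGTTGGAGCT"


def allele_specific_primer(zip_code: str, upstream: str, base: str) -> str:
    """Join zip-code complement, gene-specific sequence, and discriminating base."""
    return reverse_complement(zip_code) + upstream + base


# G12S (GGT -> AGT): the discriminating base at codon 12, position 1, is A.
primer = allele_specific_primer(ZIP13, UPSTREAM_C12_1, "A")
print(primer)
print(len(primer))  # 50
```

Because the zip-code complement is independent of the gene-specific portion, the same helper could be pointed at any other target sequence, which is the sense in which the array is described as universal.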



ted to dust, scratches, and/or small bubbles in the
polymer. These flaws are readily recognized
because they are weak and sporadic, rather than
reproducing the quadruplicate spotting pattern; we
expect such noise will be minimized with more
stringent manufacturing conditions. Ultimately,
these protocols are amenable to quantifying the
relative amounts of each allele, and work is
currently in progress to convert our quantitative
PCR/LDR protocols for K-ras mutations from
gel-based detection to array-based detection
(unpublished results).

Array capture sensitivity 

After an LDR, the successfully ligated and fluorescently
labeled LDR product competes with an
excess of unligated discriminating primer for
hybridization to the correct zip-code address
on the array. To determine capture sensitivity,
DNA arrays were hybridized in quadruplicate,
under standard conditions, with from 100 amol
(= 1/90,000) to 30 fmol (= 1/300) of a labeled synthetic
70-mer, FAMcZip13-Prd (this simulates a full-length
LDR product; see Materials and Methods for
the sequence), in the presence of a full set of K-ras
LDR primers (combined total of 9000 fmol of
discriminating and common primers). Array analyses
with a FluorImager (Figure 4, left-side) indicate that
a signal-to-noise ratio of greater than 3:1 can be
achieved when starting with a minimum of 3 fmol
(= 1/3,000) of FAMcZip13-Prd-labeled probe in the
presence of 4500 fmol of FAM-labeled LDR primers
and 4500 fmol of zip-code complement primers in
the hybridization solution. Results using
microscope/CCD instrumentation to quantify fluorescence
were even more striking; a 3:1 signal-to-noise
ratio was maintained starting with 1 fmol
(= 1/9,000) of labeled product (Figure 4, right-hand
side) on three out of the four arrays; the signal to
noise was 2:1 on the fourth array. For a given array,
with fluorescence quantified by either instrument,
the captured counts varied linearly with the amount
of labeled FAMcZip13-Prd added. Rehybridization
of the same probe, at the same concentration, to the
same array, was reproducible within ±5% (data not
shown). Variations in fluorescent signal between
arrays may reflect variations in the amount of zip-code
oligonucleotide coupled, due to the inherent
inaccuracies of manual spotting and/or variations
in polymer uniformity.
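The fractions quoted in parentheses above follow directly from dividing the amount of labeled 70-mer by the 9000 fmol of competing LDR primers. A minimal Python sketch of that arithmetic is given here; the function name is hypothetical and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

# Combined discriminating + common LDR primers in the hybridization (from the text).
PRIMER_TOTAL_FMOL = 9000


def probe_to_primer_ratio(probe_fmol: str) -> Fraction:
    """Ratio of labeled 70-mer to total LDR primers; amount given as a decimal string."""
    return Fraction(probe_fmol) / PRIMER_TOTAL_FMOL


for amount in ("0.1", "1", "3", "30"):  # 100 amol = 0.1 fmol
    print(f"{amount} fmol -> {probe_to_primer_ratio(amount)}")
```

Run as written, the loop prints the ratios 1/90000, 1/9000, 1/3000 and 1/300, matching the values given in the text.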

Detection of low abundance mutations by 
PCR/LDR/array hybridization 

To determine the limit of detection of low-level
mutations in wild-type DNA using PCR/LDR/
array hybridization, a dilution series was set up
and analyzed. PCR-amplified pure G12V DNA
was diluted into wild-type K-ras DNA in ratios
ranging from 1:20 to 1:500. Duplicate LDRs were
carried out on 2000 fmol of total DNA, using a
two-primer set consisting of 2000 fmol each of the
discriminating and common primers for the G12V
mutation. It proved possible to quantify a positive
hybridization signal at a dilution of 1:200 with a
signal-to-noise ratio of 2:1 (Figure 5). A signal was
distinguishable even at a dilution of 1:500,
although noise levels due to dust or bubbles in the
polymer prevented us from accurately quantifying
the results. A control of pure wild-type DNA
showed no hybridization signal. These results indicate
clearly that zip-code array hybridization,
when coupled with PCR/LDR, may detect polymorphisms
present at less than 1% of the total
DNA. These results are consistent with our earlier
work showing that PCR/LDR, using a 26-primer
set and analyses based on gel electrophoresis of
[Figure 5 horizontal axis: G12V mutant DNA template mixed with 2000 fmol
wild-type DNA template]

Figure 5. Detection of minority K-ras mutant DNA in
a majority of wild-type DNA using PCR/LDR with zip-code
array capture. DNA from cell line SW620,
containing the G12V mutation, and DNA from normal
lymphocytes were PCR amplified in exon 1 of the K-ras
gene. Mixtures containing 10, 20, 40, or 100 fmol of
G12V-amplified fragment plus 2000 fmol of PCR-amplified
wild-type fragment were prepared, and the presence
of mutant DNA determined by LDR using
primers specific for the G12V mutation (2000 fmol each
of discriminating and common primer). Images were
collected by CCD using exposure times from five to
25 seconds. Data were normalized by dividing fluorescent
signal intensity by acquisition time. Each data
point represents the average hybridization signal from
four independent robotically spotted arrays. The average
background signal from all four spots at each address
following hybridization of pure wild-type control (880
average fluorescent counts) was subtracted from the
mutant signal.



products, can detect any K-ras mutation in the presence
of up to a 500-fold excess of wild-type, with
a signal-to-noise ratio of at least 3:1 (Khanna et al.,
1999).
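To make the "less than 1% of the total DNA" statement concrete, the mixtures listed in the Figure 5 caption (10, 20, 40, or 100 fmol of G12V fragment in 2000 fmol of wild-type fragment) can be converted to dilution ratios and percentages. The sketch below is illustrative only; the function names are hypothetical.

```python
WILD_TYPE_FMOL = 2000.0  # wild-type fragment per mixture (Figure 5 caption)


def dilution_ratio(mutant_fmol: float, wild_type_fmol: float = WILD_TYPE_FMOL) -> float:
    """Return N such that mutant:wild-type is 1:N."""
    return wild_type_fmol / mutant_fmol


def mutant_percent(mutant_fmol: float, wild_type_fmol: float = WILD_TYPE_FMOL) -> float:
    """Return the mutant fragment as a percentage of total DNA in the mixture."""
    return 100.0 * mutant_fmol / (mutant_fmol + wild_type_fmol)


for fmol in (100, 40, 20, 10):
    print(f"{fmol} fmol G12V: 1:{dilution_ratio(fmol):.0f}, "
          f"{mutant_percent(fmol):.2f}% of total DNA")
```

Under these definitions the 10 fmol mixture corresponds to a 1:200 dilution, i.e. roughly 0.5% of the total DNA.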

Comparison of universal array to 
gene-specific arrays 

Our approach to mutation detection has three
orthogonal components: (i) primary PCR amplification;
(ii) solution-phase LDR detection; and (iii)
solid-phase hybridization capture. Therefore, background
signal from each step can be minimized
and, consequently, the overall sensitivity and accuracy
of our method are significantly enhanced over
those provided by other strategies. For example,
methods based on hybridization of labeled target require:
(i) multiple rounds of PCR or PCR/T7 transcription;
(ii) processing of PCR amplified products to fragment
them or render them single-stranded; and
(iii) lengthy hybridization periods (ten hours or
more) which limit throughput (Chee et al., 1996;
Cronin et al., 1996; Guo et al., 1994; Hacia et al.,
1996; Schena et al., 1996; Shalon et al., 1996; Wang
et al., 1998). Additionally, since the immobilized
probes on the aforementioned arrays have a wide
range of Tm values, it is necessary to perform the


hybridizations at temperatures from 0 °C to 44 °C.
The result is increased background noise and false
signals due to mismatch hybridization and non-specific
binding, for example, on small insertions
and deletions in repeat sequences (Cronin et al.,
1996; Hacia et al., 1996; Southern, 1996; Wang et al.,
1998). In contrast, our approach allows multiplexed
PCR in a single reaction (Belgrader et al., 1996),
does not require an additional step to convert product
into single-stranded form, and can readily distinguish
all point mutations including slippage in
repeat sequences (Day et al., 1995; Khanna et al.,
1999). Alternative DNA arrays suffer from differential
hybridization efficiencies due to either
sequence variation or to the amount of target present
in the sample. By using our approach of
designing divergent zip-code sequences with similar
thermodynamic properties, hybridizations can
be carried out at 65 °C, resulting in a more stringent
and rapid hybridization. The decoupling of
the hybridization step from the mutation detection
stage offers the prospect of quantification of LDR
products, as we have already achieved using gel-based
LDR detection (Khanna et al., 1999).

Arrays spotted on polymer surfaces provide substantial
improvements in signal capture, as compared
with arrays spotted or synthesized in situ
directly on glass surfaces (Drobyshev et al., 1997;
Parinov et al., 1996; Yershov et al., 1996). However,
the polymers described by others are limited to
using 8 to 10-mer addresses, while our polymeric
surface readily allows 24-mer zip-codes to penetrate
and couple covalently. Moreover, LDR products
of length 60 to 75 nucleotide bases are also
found to penetrate and subsequently hybridize to
the correct address. As additional advantages, our
polymer gives little or no background fluorescence
and does not exhibit non-specific binding of fluorescently
labeled oligonucleotides. Finally, zip-codes
spotted and coupled covalently at a discrete
address do not "bleed over" to neighboring spots,
hence obviating the need to physically segregate
sites, e.g. by cutting gel pads.



Summary and Conclusions 

Here, we describe a strategy for high-throughput
mutation detection which differs substantially from
other array-based detection systems presented previously
in the literature. In concert with a polymerase
chain reaction/ligase detection reaction (PCR/
LDR) assay carried out in solution, our array
allows for accurate detection of single base
mutations, whether inherited and present as 50%
of the sequence for that gene, or sporadic and present
at 1% or less of the wild-type sequence. We
achieve this sensitivity because thermostable DNA
ligase provides the specificity of mutation discrimination,
while the divergent addressable portions
(zip-codes) of our LDR primers guide each LDR
product to a designated address on the DNA
array. Since the zip-code sequences remain constant
and their complements can be appended to
any set of LDR primers, our zip-code arrays are
universal. Thus, a single array design can be programmed
to detect a wide range of genetic
mutations.

Robust methods for the rapid detection of
mutations at numerous potential sites in multiple
genes hold great promise to improve the diagnosis
and treatment of cancer patients. Non-invasive
tests for mutational analysis of shed cells in saliva,
sputum, urine, and stool could significantly simplify
and improve the surveillance of high risk
populations and reduce the cost and discomfort of
endoscopic testing, thus leading to more effective
diagnosis of cancer in its early, curable stage.
Although the feasibility of detecting shed
mutations has been demonstrated clearly in
patients with known and genetically characterized
tumors (Caldas et al., 1994; Hasegawa et al., 1995;
Nollau et al., 1996; Sidransky et al., 1992; Wu et al.,
1994), effective presymptomatic screening will
require that a myriad of potential low frequency
mutations be identified with minimal false-positive
and false-negative signals. Furthermore, the integration
of technologies for determining genetic
changes within a tumor with clinical information
about the likelihood of response to therapy could
radically alter how patients with more advanced
tumors are selected for treatment. Identification
and validation of reliable genetic markers will
require that many candidate genes be tested in
large-scale clinical trials. While costly microfabricated
chips can be manufactured with over 100,000
addresses, none of them has as yet demonstrated a
capability to detect low abundance mutations
(Chee et al., 1996; Hacia et al., 1996; Kozal et al.,
1996; Sapolsky et al., 1999; Wang et al., 1998), as
required to accurately score mutation profiles in
such clinical trials. The universal zip-code array
approach introduced here has the potential to
allow rapid and reliable identification of low abundance
mutations in multiple codons in numerous
genes. As new therapies targeted to specific genes
or specific mutant proteins are developed, the
importance of rapid and accurate high-throughput
genetic testing will undoubtedly increase.

Materials and Methods 

Oligonucleotide synthesis and purification

Oligonucleotides were obtained as custom synthesis
products from IDT, Inc. (Coralville, IA), or synthesized
in-house on an ABI 394 DNA Synthesizer (PE Biosystems
Inc., Foster City, CA) using standard phosphoramidite
chemistry. Spacer phosphoramidite 18, 3'-amino-modifier
C3 CPG, and 3'-fluorescein CPG were purchased
from Glen Research (Sterling, VA). All other reagents
were purchased from PE Biosystems. Oligonucleotides
with 3'-amino modifications and/or fluorescent labels
were cleaved from the supports by treatment with concentrated
aqueous NH4OH for two hours at 25 °C, and
deprotection continued in solution for 24 hours at 25 °C.
Texas Red labeling was achieved by adding 150 µl of
0.2 M NaHCO3 and 200 µg of oligonucleotide to tubes
containing a solution of 500 µg of Texas Red-X succinimidyl
ester (Molecular Probes; Eugene, OR) in 28 µl of anhydrous
DMF. Following overnight stirring at 25 °C, the
majority of unreacted label was removed by the addition
of 20 µl of 3 M NaCl and 500 µl of cold ethanol, chilling
in a dry ice/ethanol bath for 30 minutes, and centrifuging
at 12,000 g for 30 minutes. The supernatants were
removed, the pelleted oligonucleotides were washed with
100 µl of 70% ethanol, and dried. FAMcZip13-Prd, a
fluorescein-labeled 70-mer that simulates a full-length LDR
product containing the complementary sequence to
Zip13, was synthesized on 1000 Å pore-size CPG.
The sequence was:
5'-fluorescein-CGCACGATAGGTGGTCTACCGCTGATATAAACTTGTGGTAGTTGGAGCTAGTGGCGTAGGCAAGAGTGCC-3'
(the zip-code complement is the first 24 bases).

Both labeled and unlabeled oligonucleotides were purified
by electrophoresis on denaturing 12% polyacrylamide
gels. Bands were visualized by UV shadowing,
excised from the gel, and eluted overnight in 0.5 M
NaCl, 5 mM EDTA (pH 6.0) at 37 °C. Oligonucleotide
solutions were desalted on C18 Sep-Paks (Waters Corporation;
Milford, MA) according to the manufacturer's
instructions, following which the oligonucleotides were
concentrated to dryness (Speed-Vac) and stored at 



DNA extraction from cell lines 

Cell lines of known K-ras genotype (HT29, wild-type;
SW1116, G12A; LS180, G12D; SW620, G12V; DLD1,
G13D) were grown in RPMI culture media with 10%
fetal bovine serum. Harvested cells (~10^7) were resuspended
in DNA extraction buffer (10 mM Tris-HCl
(pH 7.5), 150 mM NaCl, 2 mM EDTA (pH 8.0), 0.5%
(w/v) SDS, 200 µg/ml proteinase K) and incubated at
37 °C for four hours. A 30% volume of 6 M NaCl was
added and the mixture was centrifuged. The supernatant
was transferred to a clean tube and the DNA was pelleted
through the addition of three volumes of ethanol,
chilling on dry ice, and centrifugation. The pellet was
washed with 70% ethanol and resuspended in 10 mM
Tris-HCl (pH 7.5), 2 mM EDTA (pH 8.0).

DNA extraction from paraffin sections

Tissue sections (10 µm) were cut from paraffin-embedded
colon tumors. Samples were deparaffinized
via sequential extraction with xylene, ethanol, and
acetone, and dried under vacuum. The DNA in the
pellets was purified using a QIAamp Tissue Kit (Qiagen;
Chatsworth, CA).



Polymer coated slides 

Microscope slides (Fisher Scientific, precleaned,
3 in x 1 in x 1.2 mm) were immersed in 2%
γ-methacryloyloxypropyltrimethoxysilane, 0.2% triethylamine in
CHCl3 for 30 minutes at 25 °C, and then washed with
CHCl3 (two washes of 15 minutes). A monomer solution
(20 µl of 8% acrylamide, 2% acrylic acid, 0.02%
N,N'-methylene-bisacrylamide (500:1 ratio of monomers:crosslinker),
0.8% ammonium persulfate radical polymerization
initiator) was deposited on one end of the
slides and spread out with the aid of a cover-slip
(24 mm x 50 mm) that had been previously silanized
(5% (CH3)2SiCl2 in CHCl3). Polymerization was achieved
by heating the slides on a 70 °C hotplate for 4.5 minutes.
Upon removal of the slides from the hotplate, the cover-slips
were immediately peeled off with the aid of a
single-edge razor blade. The coated slides were rinsed
with deionized water, allowed to dry in open atmosphere,
and stored under ambient conditions.

Zip-code arrays

Polymer-coated slides were pre-activated by immersing
them for 30 minutes at 25 °C in a solution of 0.1 M
1-[3-(dimethylamino)propyl]-3-ethylcarbodiimide hydrochloride
plus 20 mM N-hydroxysuccinimide in 0.1 M
K2HPO4/KH2PO4 (pH 6.0). The activated slides were
rinsed with water, and then dried in a 65 °C oven; they
were stable upon storage for six months or longer at
25 °C in a desiccator over Drierite.

For manual spotting, 0.2 µl aliquots were taken with a
Rainin Pipetman from stock solutions (500 nM) of zip-code
oligonucleotides in 0.2 M K2HPO4/KH2PO4
(pH 8.3), and deposited in a 3 x 3 array onto the pre-activated
polymeric surfaces. The resulting arrays were
incubated for one hour at 65 °C in humidified chambers
containing water/formamide (1:1). For robotic spotting,
10-50 nL aliquots of zip-code oligonucleotides (1.5 mM
in the same buffer) were deposited at 25 °C on the pre-activated
surfaces by using a robot (PE Biosystems, "in-house"
design) equipped with a quill-type spotter in a
controlled atmosphere chamber. Two pairs of 3 x 3
arrays were spotted on each slide, with addresses consisting
of groups of four spots. Following spotting using
either method, uncoupled oligonucleotides were
removed from the polymer surfaces by soaking the slides
in 300 mM bicine (pH 8.0), 300 mM NaCl, 0.1% SDS, for
30 minutes at 65 °C, rinsing with water, and drying. The
arrays were stored at 25 °C in slide boxes until needed.

PCR ampliftcation of K-ras DNA samples 

PCR amplifications were carried out under paraffin oil
in 20 µl reaction mixtures containing 10 mM Tris-HCl
(pH 8.3), 15 mM MgCl2, 50 mM KCl, 800 µM dNTPs,
2.5 mM forward and reverse primers (12.5 pmol of each
primer; Table 3), and 1-50 ng of genomic DNA extracted
from paraffin-embedded tumors or from cell lines. Following
a two minute denaturation step at 94 °C, 0.2 unit
of Taq DNA polymerase (PE Biosystems) was added.
Amplification was achieved by thermally cycling for 40
rounds of 94 °C for 15 seconds and 60 °C for two minutes,
followed by a final elongation at 65 °C for five minutes.
Following PCR, 1 µl of proteinase K (18 mg/ml)
was added, and reactions were heated to 70 °C for
ten minutes and then quenched at 95 °C for 15 minutes.
One microliter of each PCR product was analyzed on a
3% agarose gel to verify the presence of amplification
product of the expected size.

LDR of K-ras DNA samples

LDR was carried out under paraffin oil in 20 µl
volumes containing 20 mM Tris-HCl (pH 8.5), 5 mM
MgCl2, 100 mM KCl, 10 mM DTT, 1 mM NAD+, 8 pmol
of total LDR primers (500 fmol each of discriminating
primers + 4 pmol of fluorescently labeled common primers),
and 1 pmol of PCR products from cell line or
tumor samples. Two primer mixes were prepared, each
containing the seven mutation-specific primers, the three
common primers, and either the wild-type discriminating
primer for codon 12 or that for codon 13 (Figure 1(c)
and Table 3).
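The 8 pmol primer total quoted above is consistent with each mix holding eight discriminating primers (seven mutation-specific plus one wild-type) at 500 fmol each, together with 4 pmol of common primers. A short Python check of that bookkeeping is sketched below; the constant names are hypothetical, and the reading that 4 pmol is the combined amount of common primers is an assumption.

```python
FMOL_PER_PMOL = 1000

N_MUTATION_SPECIFIC = 7        # mutation-specific discriminating primers per mix
N_WILD_TYPE = 1                # wild-type primer for codon 12 or codon 13
DISCRIMINATING_FMOL_EACH = 500
COMMON_TOTAL_FMOL = 4 * FMOL_PER_PMOL  # assumed to be the combined common primers

discriminating_total = (N_MUTATION_SPECIFIC + N_WILD_TYPE) * DISCRIMINATING_FMOL_EACH
total_primers_pmol = (discriminating_total + COMMON_TOTAL_FMOL) / FMOL_PER_PMOL

print(discriminating_total)  # 4000 fmol of discriminating primers
print(total_primers_pmol)    # 8.0 pmol of total LDR primers
```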

The reaction mixtures were pre-heated for two minutes
at 94 °C, and then 25 fmol of wild-type Tth DNA
ligase was added. The LDRs were cycled for 20 rounds
of 94 °C for 30 seconds and 65 °C for four minutes. An
aliquot of 2 µl of each reaction was mixed with 2 µl of
gel loading buffer (8% blue dextran, 50 mM EDTA
(pH 8.0), formamide (1:5)), denatured at 94 °C for
two minutes, and chilled on ice. 1 µl of each mixture was
loaded onto a denaturing 10% polyacrylamide gel and
electrophoresed on an ABI 377 DNA sequencer at
1500 volts.



Hybridization of K-ras LDR products to DNA arrays

The LDRs (17 µl) were diluted with 40 µl of 1.4x
hybridization buffer to produce a final buffer concentration
of 300 mM Mes (pH 6.0), 10 mM MgCl2, 0.1%
SDS, denatured at 94 °C for three minutes, and chilled
on ice. Arrays were pre-incubated for 15 minutes at
25 °C in 1x hybridization buffer. Coverwells (Grace, Inc.;
Sunriver, OR) were filled with the diluted LDRs and
attached to the arrays. The arrays were placed in
humidified culture tubes and incubated for one hour at
65 °C and 20 rpm in a rotating hybridization oven. Following
hybridization, the arrays were washed in
300 mM bicine (pH 8.0), 10 mM MgCl2, 0.1% SDS for
ten minutes at 25 °C. Fluorescent signals were measured
using a microscope/CCD (see below).
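The dilution step above (17 µl of LDR plus 40 µl of 1.4x buffer) gives close to 1x buffer in the final 57 µl. The arithmetic can be sketched as follows; the function name is hypothetical, and the calculation assumes the LDR sample contributes no hybridization buffer components of its own.

```python
def final_buffer_strength(stock_x: float, stock_ul: float, sample_ul: float) -> float:
    """Final buffer strength (in 'x') after mixing concentrated stock with sample.

    Assumes the sample itself contributes none of the buffer components.
    """
    return stock_x * stock_ul / (stock_ul + sample_ul)


strength = final_buffer_strength(stock_x=1.4, stock_ul=40.0, sample_ul=17.0)
print(f"{strength:.2f}x in {40.0 + 17.0:.0f} µl")  # 0.98x in 57 µl
```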

Hybridization of synthetic LDR products to
DNA arrays

Quadruplicate hybridization mixtures were prepared
containing 100 amol, 1 fmol, 3 fmol, 10 fmol, or 30 fmol
of FAMcZip13-Prd (a synthetic 70-mer LDR product
complementary to zip-code 13) combined with 4500 fmol
of total fluorescein-labeled common LDR primers and
9 x 500 fmol of each unlabeled, zip-code-containing
discriminating LDR primer in 55 µl of 300 mM Mes
(pH 6.0), 10 mM MgCl2, 0.1% SDS. Hybridizations were
conducted according to the protocol described above,
and FluorImager as well as epifluorescence microscopy
data were acquired and analyzed (see below).

LDR and hybridization of G12V/G12 dilution series 
to DNA arrays 

These experiments were carried out in a volume of
20 µl. The PCR-amplified SW620 cell line DNA containing
the G12V mutation was diluted from 5 nM
(100 fmol = 1/20) to 0.050 nM (1 fmol = 1/2000) in LDR
mixtures containing 100 nM (2000 fmol) of wild-type
(G12) DNA and 100 nM (2000 fmol) of both G12V-discriminating
primer and Texas Red-labeled common primer.
The LDR and hybridization proceeded as above, and
imaging on the microscope/CCD was carried out as
detailed below.
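The paired amounts and concentrations quoted above follow from the 20 µl reaction volume, since 1 fmol per µl equals 1 nM. A minimal Python sketch of the conversion is given here; the function name is hypothetical.

```python
REACTION_VOLUME_UL = 20.0  # LDR volume stated in the text


def fmol_to_nanomolar(amount_fmol: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Convert an amount in fmol to nM: 1 fmol/µl = 1e-15 mol / 1e-6 L = 1 nM."""
    return amount_fmol / volume_ul


for fmol in (2000, 100, 1):
    print(f"{fmol} fmol in 20 µl = {fmol_to_nanomolar(fmol):g} nM")
```

This reproduces the 100 nM, 5 nM, and 0.05 nM figures given for the wild-type template, the highest mutant amount, and the lowest mutant amount, respectively.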



Image analysis 

Arrays were imaged using a Molecular Dynamics
FluorImager 595 (Sunnyvale, CA) or an Olympus AX70
epifluorescence microscope (Melville, NY) equipped with
a Princeton Instruments TE/CCD-512 TKBM1 camera
(Trenton, NJ). For analysis of fluorescein-labeled probes
on the FluorImager, the 488 nm excitation was used with
a 530/30 emission filter. The spatial resolution of scans
was 100 µm per pixel. The resulting images were
analyzed using ImageQuaNT software provided with
the instrument. The epifluorescence microscope was
equipped with a 100 W mercury lamp, a FITC filter cube
(excitation 480/40, dichroic beam splitter 505, emission
535/50), a Texas Red filter cube (excitation 560/55,
dichroic beam splitter 595, emission 645/75), and a
100 mm macro objective. The macro objective allows illumination
of an object field up to 15 mm in diameter and
projects a 7 mm x 7 mm area of the array onto the
12.3 mm x 12.3 mm matrix of the CCD. Images were
collected in 16-bit mode using the Winview32 software
provided with the camera. Analysis was performed
using Scion Image (Scion Corporation, Frederick, MD),
We have fabricated a flow-through biochip assembly that
consisted of two different microchips: (1) a polycarbonate
(PC) chip for performing an allele-specific ligation detection
reaction (LDR) and (2) a poly(methyl methacrylate)
(PMMA) chip for the detection of the LDR products using
a universal array platform. The operation of the device
was demonstrated by detecting low-abundant DNA mutations
in gene fragments (K-ras) that carry point mutations
with high diagnostic value for colorectal cancers. The PC
microchip was used for the LDR in a continuous-flow
format, in which two primers (a discriminating primer that
carried the complement base to the mutation being
interrogated and a common primer) that flanked the point
mutation were ligated only when the particular
mutation was present in the genomic DNA. The miniaturized
reactor architecture allowed enhanced reaction speed
due to its high surface-to-volume ratio and efficient
thermal management capabilities. A PMMA chip was
employed as the microarray device, where zip code
sequences (24-mers), which were complementary to
sequences present on the target, were microprinted into
fluidic channels embossed into the PMMA substrate.
Microfluidic addressing of the array reduced the hybridization
time significantly through enhanced mass transport
to the surface-tethered zip code probes. The two microchips
were assembled as a single integrated unit with a
novel interconnect concept to produce the flow-through
microfluidic biochip. A microgasket, fabricated from an
elastomer poly(dimethylsiloxane) with a total volume of
the interconnecting assembly of <200 nL, was used as
the interconnect between the two chips to produce the
three-dimensional microfluidic network. We successfully
demonstrated the ability to detect one mutant DNA in 100
normal sequences with the biochip assembly. The LDR/
hybridization assay using the assembly performed the
entire assay at a relatively fast processing speed; 6.5 min
for on-chip LDR, 10 min for washing, and 2.6 min for 
fluorescence scanning (total processing time 1 9, 1 min) 
and could screen multiple mutations simultaneously. 

With completion of the sequencing of the human genome, new 
genes are being discovered at an accelerated pace as well as 
determining the function of these genes (functional genomics) 
and potential association of these genes and mutations within them 
to particular phenotypes (disease states). Efforts in functional 
genomics have produced an array of new diagnostic markers for 
clinical staging (prognosis), early detection, and predicting/ 
monitoring the course of treatments for many genetic-based 
diseases. In most cases, a panel of mutations must be evaluated 
to obtain an accurate diagnosis/prognosis of that disease. For 
example, colorectal adenomas and cancers have been determined 
to possess point mutations in K-ras genes (19 different mutations), 
which can occur early in the development of the disease in nearly 
30-50% of patients. Most of these K-ras mutations are localized 
on codon 12 and to a lesser degree at codons 13 and 61. Once 
acquired, K-ras mutations are conserved throughout the course 
of tumor development. For example, a single base substitution 
within codon 12 (GGT) in exon 1 of the K-ras gene may mutate 
to GAT, GCT, or GTT, being translated into the specific amino 
acid residues Asp, Ala, and Val, respectively. These amino acid 
residues play a critical role in GTP binding, and point mutations 
in these codons produce oncogenic p21 ras proteins that resist 
GTP hydrolysis and have constitutively active signaling functions. 
A major hurdle toward mutation detection is that the mutation 
of interest (mutant DNA) may be present in low copy numbers 
(minority) in a mixed population of higher copy number wild- 
type DNA (majority). Even at the primary tumor site for many 
cancers, the normal stromal cell content can be as high as 70%. 
Therefore, if the mutation is found in only one of the two 
chromosomes of a tumor cell (heterozygous), the mutated DNA 
can be present as low as 15%. This number goes down 
precipitously if the sampling is done away from the primary tumor 
site. Thus, there is a need to develop technology that can identify 
accurately one or more low-abundant mutations at multiple 
adjacent, nearby, and distal loci in a large number of genes 
(multiplexed assays) with samples containing high stromal cell 
infiltration. 
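
The 15% figure can be checked with a one-line calculation. The sketch 
below assumes diploid cells, that every tumor cell carries the mutation 
on the stated number of alleles, and that stromal cells carry none; the 
function name is illustrative. 

```python
def mutant_allele_fraction(stromal_fraction: float,
                           mutated_alleles_per_tumor_cell: int = 1,
                           alleles_per_cell: int = 2) -> float:
    """Fraction of all alleles in the sample that carry the mutation.

    Assumes stromal cells are wild type and all tumor cells are mutated
    on `mutated_alleles_per_tumor_cell` of their `alleles_per_cell` alleles.
    """
    tumor_fraction = 1.0 - stromal_fraction
    return tumor_fraction * mutated_alleles_per_tumor_cell / alleles_per_cell


if __name__ == "__main__":
    # 70% stromal content, heterozygous mutation
    print(f"{mutant_allele_fraction(0.70):.2f}")
```

With 70% stromal cells, only 30% of the cells are tumor cells, and a 
heterozygous mutation occupies half of their alleles, giving 0.15. 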

One technique that can distinguish low-abundant mutant DNA 
from wild-type DNA is the ligase detection reaction (LDR) coupled 
to a primary PCR. A conceptual schematic of the PCR/LDR 
technique is depicted in Figure 1. Following PCR amplification of 
the appropriate gene fragments, which contain sections of the 
gene(s) that possess the point mutations, the amplicon is mixed 
with two LDR primers that flank the mutation of interest (common 
primer and discriminating primer). The discriminating primer 
contains a base at its 3'-end that coincides with the single-base 
mutation site. If there is a mismatch, ligation of the two primers 
does not occur. However, a perfect match results in a successful 
ligation of the two primers and produces a product that can be 
analyzed in a variety of fashions. The advantages of PCR/LDR 
are that it can be configured to do highly multiplexed assays and 
uses a thermally stable ligase enzyme to linearly amplify the LDR 
product. 
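
The discrimination rule described above can be illustrated with a toy 
model. The sketch below is a deliberate simplification and not a model 
of ligase kinetics: primers are written in the same sense as the target 
strand shown, ligation is allowed only when the common primer lies 
immediately downstream of a perfectly matched discriminating primer, 
and the short sequences are illustrative fragments around K-ras codon 12 
chosen for the example. 

```python
def ligation_occurs(target: str, discriminating: str, common: str) -> bool:
    """Toy LDR rule: ligate only if both primers match and abut on the target.

    `target` is the strand with the same sense as the primers (the primers
    would hybridize to its complement). The 3' base of the discriminating
    primer sits directly at the ligation junction.
    """
    junction = target.find(common)
    if junction < len(discriminating):
        return False  # common primer absent, or no room for the upstream primer
    upstream = target[junction - len(discriminating):junction]
    body_matches = upstream[:-1] == discriminating[:-1]
    three_prime_matches = upstream[-1] == discriminating[-1]
    return body_matches and three_prime_matches


# Illustrative fragments (assumed for this example): codon 12 is GGT in the
# wild type and GTT in the G12V mutant.
WILD_TYPE = "GTTGGAGCTGGTGGCGTAGGC"
G12V = "GTTGGAGCTGTTGGCGTAGGC"
DISC_WILD_TYPE = "GTTGGAGCTGG"   # 3' base G
DISC_G12V = "GTTGGAGCTGT"        # 3' base T
COMMON = "TGGCGTAGGC"

if __name__ == "__main__":
    for name, target in (("wild type", WILD_TYPE), ("G12V", G12V)):
        print(name, ligation_occurs(target, DISC_G12V, COMMON))
```

Only the G12V target supports ligation of the G12V discriminating 
primer; on the wild-type target the 3' base is mismatched and no 
product forms. 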

Recently, attention has focused on developing microfabricated 
devices for biological amplifications that require temperature 
regulation, such as PCR and dideoxy cycle sequencing, since 
they can offer lower thermal capacitance, require smaller amounts 
of reagents for the reaction, possess the potential for automation, 
and be integrated to subsequent processing steps configured on 
chips to improve automation and minimize sample contamination, 
which is extremely important to circumvent in clinical settings 
for early detection of a disease. During the past decade, a number 
of groups have designed chamber-type PCR chips, where a 
stationary PCR mixture in a confined space is alternately heated 
and cooled. Alternatively, DNA amplification can be achieved 
in a microfluidic platform by shuttling a PCR cocktail in a 
microchannel repetitively through different isothermal zones using 
a continuous-flow (CF) format. The CFPCR approach can be 
conducted at relatively high speeds since it is not necessary to 
heat and cool the large thermal mass associated with the 
amplification chamber repeatedly. We have developed a unique 
spiral microchannel with 20 loops hot-embossed into polycarbonate 
(PC) for rapid CFPCR. Ultrafast PCR was demonstrated 
with the speed of the reaction determined primarily by enzyme 
kinetics (AmpliTaq polymerase) using a flow velocity of 15 mm/s, 
resulting in successful amplification of a 500-bp fragment in 
1.7 min at a cycle rate of 5 s/cycle. 

DNA microarrays can be configured to detect sequence 
variations at many different sites simultaneously for potential 
diagnostic applications. Recently, universal zip code arrays 
have been developed for monitoring products generated from 
allele-specific reactions, such as LDR. The array format (see 
Figure 1) uses small probes that serve as zip codes (24-mers with 
similar Tm values) that contain unique sequences not found in 
the sample DNA template. The PCR/LDR uses discriminating and 
common primers similar to that described above. However, the 
allele-specific primer contains on its 5'-end a zip code complement 
that directs the ligation product to a particular address on the 
array. The common primer contains a fluorescent dye on its 3'- 
end. If the mutation is present, LDR ligates the two primers 
together and generates a fluorescence signal at the appropriate 
location of the array. The attractive feature of these universal 
arrays is that they can be used to detect a variety of mutations by 
simply appending the correct zip code complement sequence to 
the discriminating primer used in the LDR step. 
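
The addressing scheme can be sketched in a few lines. The example 
below is illustrative only: the two 24-mers stand in for zip code 
probes, the target-specific portions are short placeholders, and the 
helper names are assumptions made for this sketch. It shows that 
appending the reverse complement of a zip code to the 5'-end of a 
discriminating primer fixes the array address of the ligation product. 

```python
from typing import Dict, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_discriminating_primer(zip_code: str, target_specific: str) -> str:
    """Prepend the zip code complement (5'-end) to the target-specific part."""
    return reverse_complement(zip_code) + target_specific


def array_address(product: str, zip_codes: Dict[str, str]) -> Optional[str]:
    """Return the name of the zip code probe that the product would bind."""
    for name, zip_seq in zip_codes.items():
        if product.startswith(reverse_complement(zip_seq)):
            return name
    return None


# Illustrative 24-mer addresses (assumed for this example).
ZIP_CODES = {
    "zip 1": "TGCGACCTCAGCATCGACCTCAGC",
    "zip 3": "CAGCACCTGACCATCGATCGCAGC",
}

if __name__ == "__main__":
    primer = build_discriminating_primer(ZIP_CODES["zip 3"], "GTTGGAGCTGA")
    product = primer + "TGGCGTAGGC"  # ligated to a placeholder common primer
    print(array_address(product, ZIP_CODES))
```

Because the address depends only on the appended complement, the same 
array can report a different mutation simply by pairing its 
discriminating primer with a different zip code. 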

Several reports have discussed merging microarrays with 
microfluidics. The attractive features of this marriage include the 
reduced amount of sample required to address each element of 
the array, the enhanced mass transport to the array surface 
reducing analysis time, the ability to monitor several samples 
simultaneously using multichannel chips, the ability to integrate 
several preprocessing steps into the microfluidic device, and the 
closed architecture of the microfluidics reducing the potential for 
sample contamination. Anderson and co-workers developed a 
highly integrated monolithic device that automatically carried out 
a complex series of molecular processes on multiple samples. 
The device could manipulate over 10 reagents in more than 60 
sequential operations and was tested for the detection of mutations 
in a 1.6-kb region of the HIV genome from serum samples 
containing as few as 500 copies of the target RNA. Cheek et al. 
developed three-dimensional flow-through arrays in glass chips 
by fabricating channels (10-µm i.d.) that were 39 pL in volume 
and 5 µm in radius with the DNA probes printed on the channel 
walls. Liu and co-workers fabricated a plastic (PC) chip that 
integrated microfluidic mixers, valves, pumps, channels, chambers, 
heaters, and DNA microarray sensors. The hybridization 
probes were tethered to the surface of gold electrodes via self- 
assembly to detect electrochemical signals generated from hybridized 
target DNA. The analysis required, from loading sample and 
different reagents into storage chambers to obtaining the 
hybridization results, nearly 3.5 h of processing time. One potential 
problem associated with the use of PC as the microarray platform 
is that it generates a significant amount of autofluorescence, which 
can potentially degrade detection limits when fluorescence readout 
is used. In addition, PC can produce large levels of nonspecific 
adsorption in its native state. 

Table 1. Sequences and melting temperatures (Tm) of the oligonucleotides 

oligonucleotide                sequence (5'-3')                                   Tm (°C) 
K-ras exon 1 forward           TTAAAAGGTACTGGTGGAGTATTTGATA                       56.0 
K-ras exon 1 reverse           AAAATGGTCAGAGAAACCTTTATCTGT                        56.8 
K-ras c12 Com-2                pTGGCGTAGGCAAGAGTGCCT-DY782                        62.5 
cZip1-K-ras c12.2 (wild type)  GCTGAGGTCGATGCTGAGGTCGCAAAACTTGTGGTAGTTGGAGCTGG    74.6 
cZip3-K-ras c12.2D             GCTGCGATCGATGGTCAGGTGCTGAAACTTGTGGTAGTTGGAGCTGA    74.3 
cZip5-K-ras c12.2A             GCTGTACCCGATCGCAAGGTGGTCAAACTTGTGGTAGTTGGAGCTGC    74.7 
cZip11-K-ras c12.2V            CGCAAGGTAGGTGCTGTACCCGCAAAACTTGTGGTAGTTGGAGCTGT    74.5 
zip code 1                     TGCGACCTCAGCATCGACCTCAGC-spacer-NH2                [illegible] 
zip code 3                     CAGCACCTGACCATCGATCGCAGC-spacer-NH2                65.5 
zip code 5                     GACCACCTTGCGATCGGGTACAGC-spacer-NH2                65.1 
zip code 11                    TGCGGGTACAGCACCTACCTTGCG-spacer-NH2                66.7 

cZip sequences are complements to the sequences of the zip code probes. 

In this paper, we report the development of a polymer flow- 
through biochip assembly that consists of a continuous-flow LDR 
(CFLDR) microchip and a microarray chip for the detection of 
low-abundant mutations in genomic DNA. We chose PC as the 
material for CFLDR due to its high glass transition temperature 
(145-148 °C), allowing it to withstand the sustained high operating 
temperatures associated with LDR (~95 °C for thermal denaturation). 
On the other hand, the material of choice for the universal 
array chip was poly(methyl methacrylate) (PMMA) because 
PMMA has significantly lower amounts of autofluorescence 
compared to PC as well as minimal nonspecific adsorption artifacts 
between PMMA and single-stranded DNAs. In addition, we have 
outlined robust procedures for covalently attaching oligonucleotide 
probes to the surface of PMMA at high concentrations. 

The need to integrate specific functions onto different chip 
materials through a three-dimensional microfluidic network led 
us to develop a simple, low-volume interconnect technology that 
provided robust leakage-free connection with minimum dead 
volume between the chips. We utilized an elastomer, 
poly(dimethylsiloxane) (PDMS), as an O-ring gasket between the 
chips in order to connect laser-drilled microchannels between the 
two chips. In this work, a PCR/LDR/hybridization assay was 
carried out on K-ras genes to detect the presence of point 
mutations possessing clinical relevance for diagnosing colorectal 
cancers. The PC chip used for LDR employed a CF format with 
isothermal zones. Using presynthesized oligonucleotide probes, 
the array containing the zip code probes was microprinted into 
microfluidic channels hot-embossed in a PMMA chip. Following 
production of the LDR and array microchips, the array was 
embedded into a machine-milled pocket in the LDR chip with the 
PDMS gasket for the construction of the three-dimensional 
microfluidic network. 

EXPERIMENTAL SECTION 

Reagents and Materials. PC and PMMA used as the 
microfluidic chip substrates were purchased from GoodFellow 
(Berwyn, PA). Chemicals used for PMMA surface modification 
and hybridization assays, including n-butyllithium, 
ethylenediamine, 50 wt % glutardialdehyde, sodium borohydride, sodium 
cyanoborohydride (5.0 M solution in aqueous ~1 M sodium 
hydroxide), and 20x sodium chloride-sodium phosphate-EDTA 
(SSPE) buffer, were purchased from Aldrich Chemical 
(Milwaukee, WI) and used as received. A 10% SDS stock solution, which 
was used for posthybridization washing, was received from 
Ambion (Austin, TX). Oligonucleotide probes and primers were 
obtained from two different sources, Integrated DNA Technologies 
(Coralville, IA) and Synthegen (Houston, TX). Their sequences 
and melting temperatures (Tm) are listed in Table 1. Thermus 
thermophilus (Tth) and Thermus aquaticus (Taq) DNA ligases were 
obtained from ABgene (Rochester, NY) and New England Biolabs 
(Beverly, MA), respectively. 

Chip Fabrication. Microchips were fabricated using methods 
previously reported in our group. Briefly, the procedure 
involved fabricating a metal molding die by LIGA, which consisted 
of raised Ni microstructures electroplated from a Ni sulfamate 
bath onto a stainless steel base plate. The topographical layout of 
the CFLDR chip is depicted in Figure 2A. The spiral channel (~1.6 
m long), which consisted of 20 loops, was 50 µm in width and 150 
µm in depth, with a 250-µm interchannel spacing. The topology of 
the microarray chip is shown in Figure 2B. Fluid access channels 
(DNA sample and wash buffers) were 100 µm in width and 50 
µm in depth. Both channels merged into one common channel 
and emptied into the hybridization chamber, which measured 500 
µm in width, 50 µm in depth, and 6.7 mm in length (total volume, 
168 nL). Replicates from the molding die were hot-embossed into 
the plastic substrates. The embossing system consisted of a PHI 
Precision Press model number TS-21-H-C (4A)-5 (City of Industry, 
CA). A vacuum chamber was installed into this press to remove 
air (pressure, <0.1 bar) to minimize replication errors. During 
embossing, the molding die for the spiral microchannel was 
heated to 190 °C and pressed into the PC wafer with a force of 
1150 lb for 5 min. The microchannel pattern for the array chip was 
hot-embossed into a PMMA plate at 155 °C and 1000 lb for 4 min. 
After hot-embossing, the press was opened and the polymer part 
removed and cooled to room temperature. 

Drilling of interconnecting microchannels at the ends of the 
hot-embossed microchannels was performed using a KrF laser 
system (RapidX 1000 Series, Resonetics Inc., Nashua, NH) with 
a laser fluence at the workpiece of ~10 J/cm² and a repetition 
rate of 50 Hz. Micrographs of the laser-drilled holes (100- and 
270-µm i.d.) are shown in Figure 2C and D. 

Milling of the pocket on the backside of the LDR microchip 
for the loading of the microarray chip was performed using a Kern 
MMP 2522 micromilling machine (Kern Micro- und Feinwerktechnik 
GmbH & Co. KG, Murnau, Germany). Milling was 
performed using a 500-µm milling bit at 40 000 rpm to provide a 
smooth finish to the surface, which was necessary to obtain a 
good seal between the microfluidic chips. 

The embossed PMMA array chips were assembled by heat 
annealing a cover plate made from the same material to the 
substrate at 107 °C for 20 min. The cover plates (500 µm thick) 
and substrates (2 mm thick) were clamped together and placed 
in a convection oven. Sealing of the PC spiral microchannel with 
a cover plate was more critical due to the elaborate pattern of the 
microchannel topography and the higher glass transition temperature 
of PC. The embossed substrate (4.5-mm thickness) and the 
cover plate (250-µm thickness) were introduced into the embossing 
machine, and the assembly was heated to 160 °C for 15 min 
under vacuum for a tight seal of PC to PC. 

Surface Modification and Array Preparation. The hot- 
embossed PMMA microchannels were treated (prior to chip 
assembly) with a procedure that was slightly modified from our 
previously published method. An N-lithioethylenediamine solution 
was prepared by dissolving n-butyllithium in ethylenediamine 
at an equal volume ratio (50:50) and magnetically agitating the 
solution until it turned dark purple. PMMA microchannels were 
aminated by immersing them in the N-lithioethylenediamine 
solution at room temperature for 5 min under nitrogen atmosphere. 
After amination was complete, the chips were thoroughly 
rinsed with doubly distilled H2O. Next, the aminated chips were 
soaked for 2 h in a 5% glutardialdehyde (cross-linking agent) 
solution with MftM sodium cyanoborohydride in phosphate buffer 
(0.5 M, pH 6.4). Zip code oligonucleotide probes 1, 3, 5, and 11 
(see Table 1 for sequences) were dissolved separately in 0.2 M 
phosphate buffer (pH B.'i) to a final concentration of 10 µM. The 
arrays were printed onto the activated PMMA. The volume of 
probe (zip code sequence) deposited was 100 nL, which resulted 
in a spot of ~100 µm in diameter. A spot for each probe was placed 
along the length of the hybridization chamber (linear array). After 
attachment of the oligonucleotide probes, excess surface aldehydes 
were capped with 0.25% (w/v) sodium borohydride solution 
in phosphate buffer (0.1 M, pH 6.1). Following deposition of the 
four probes and capping, the chip was assembled by placing a 
cover plate over the microfluidic channels, clamping the assembly 
together, and thermally annealing using the procedure described 
above. 

Near-IR Laser Scanner. Near-IR fluorescence measurements 
were made from the arrays with a device built in-house that has 
been described previously. Briefly, the near-IR scanner consisted 
of a diode laser (PicoQuant GmbH, model 800, Berlin, Germany), 
counting electronics (PicoQuant GmbH, model SPC 430), and a 
single-photon avalanche diode (SPAD, EG&G Optoelectronics, 
model SPCM-PQ, Vaudreuil, Canada). The components were 
mounted with the aid of a mounting cube and lens tubes 
purchased from Thorlabs (Newton, NJ) and configured in an epi- 
illumination format. The diode laser operated at a wavelength of 
780 nm with 7.5 mW of average power. An integrated optics set 
was provided with the laser to produce an elliptically shaped 
collimated output. The laser excitation beam was passed through 
a 780-nm line filter (Omega Optical, 780DF10, Brattleboro, VT), 
reflected by a dichroic mirror (Omega Optical, 795DRLP) and 
focused onto the array surface using a 40x high numerical 
aperture (NA) microscope objective (Nikon, Natick, MA, NA = 
0.85). The 1/e² spot size of the excitation beam was measured to 
be 3 µm (minor axis) by 5 µm (major axis). The fluorescence 
emission was collected by the same microscope objective, 
transmitted through the dichroic, a circular aperture set at 2.0 
mm (clear diameter), and finally through a filter stack consisting 
of a long-pass filter (cut-on wavelength, 830 nm; Newport Corp., 
Irvine, CA) and a band-pass filter centered at 825 nm (825RDF30, 
Omega Optical). After passing through the filters, the fluorescence 
was sent through a condensing lens (OIIAGI H/076, Melles 
Griot) and focused onto a single-photon avalanche diode. 

The entire fluorescence detector was mounted on an X/Y 
microtranslational stage. Two bipolar stepper motors interfaced 
to a PC using STP-100 stepper motor controller boards obtained 
from Pontech, Inc. (Upland, CA) drove the X and Y directions of 
the microtranslational stages. Each STP-100 was equipped with 
the RS485 interface allowing full duplex, multidrop communication 
with the host computer. A PP232-485F interface (Pontech, Inc.) 
was used to convert RS485 into the PC's RS232 protocol. Hall 
sensors were used to monitor the travel limits of the X/Y stages. 
The step resolution of these stages was either 25.4 or 50.8 µm 
with a maximum scan range of 4 cm in both the X and Y 
coordinates. The scanner operated by taking a single step and 
then acquiring the fluorescence data for a software-selectable 
integration period (10 ms-10 s). 

The data acquisition software was written in Visual Basic and 
consisted of several control and data acquisition functions such 
as recording the position of the scanning head, streaming data to 
the hard drive, and providing real-time visualization of the acquired 
images. The size of each data file was determined by the number 
of pixels included in the image file (set by the stepping resolution 
and area imaged) with four bytes representing the intensity at 
each image pixel. 
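
The relationship between step resolution, imaged area, and file size 
can be made concrete with a short calculation. The sketch below is an 
estimate under stated assumptions: one pixel is recorded per step, the 
file holds only the four-byte intensities (no header), and stage travel 
time is ignored when estimating the acquisition time. The function 
names are illustrative and are not taken from the original Visual Basic 
software. 

```python
import math

BYTES_PER_PIXEL = 4  # four bytes per intensity value, as stated in the text


def pixels_per_axis(scan_range_um: float, step_um: float) -> int:
    """Number of whole steps that fit in the scan range (one pixel per step)."""
    return math.floor(scan_range_um / step_um)


def image_file_bytes(x_range_um: float, y_range_um: float, step_um: float) -> int:
    """Estimated size of the raw intensity data for one image."""
    nx = pixels_per_axis(x_range_um, step_um)
    ny = pixels_per_axis(y_range_um, step_um)
    return nx * ny * BYTES_PER_PIXEL


def minimum_scan_time_s(x_range_um: float, y_range_um: float,
                        step_um: float, integration_s: float) -> float:
    """Lower bound on acquisition time: pixels x integration period."""
    nx = pixels_per_axis(x_range_um, step_um)
    ny = pixels_per_axis(y_range_um, step_um)
    return nx * ny * integration_s


if __name__ == "__main__":
    full_range_um = 40_000.0  # maximum scan range of 4 cm
    for step in (25.4, 50.8):
        size_mb = image_file_bytes(full_range_um, full_range_um, step) / 1e6
        print(f"step {step} um: {size_mb:.1f} MB")
```

Halving the step size quadruples both the file size and the minimum 
scan time, which is why the coarser step is attractive when only the 
spot positions need to be resolved. 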

Extraction of DNA from Cell Lines. Genomic DNA was 
extracted from cell lines of known K-ras genotype (HT29, wild- 
type; SW1116, G12A; LS180, G12D; SW620, G12V). Cell lines 
were grown in RPMI culture media with 10% bovine serum. 
Harvested cells (~1 x 10^7) were resuspended in DNA extraction 
buffer (10 mM Tris-HCl, pH 7.5, 150 mM NaCl, 2 mM EDTA, pH 
8.0) containing 0.5% SDS and 200 µg/mL proteinase K and 
incubated at 37 °C for 4 h. Thirty percent (v/v) of 6 M NaCl was 
added to the mixture, and the samples were centrifuged. DNA 
was precipitated from the supernatant with three volumes of EtOH, 
washed with 70% EtOH, and resuspended in TE buffer (10 mM 
Tris-HCl, pH 7.2, 2 mM EDTA, pH 8.0). 

PCR Amplification of Genomic DNA. PCR amplifications 
were carried out using a commercial thermal cycling machine 
(Techne, Burlington, NJ) in 50 µL of 10 mM Tris-HCl buffer (pH 
8.3) containing 50 mM KCl, 1.5 mM MgCl2, 200 µM dNTPs, 200 
nM forward and reverse primers, and between 1 and 50 ng of 
genomic DNA extracted from the cell lines. The primers used 
were as follows: forward = 5' TTA AAA GGT ACT GGT GGA 
GTA TTT GAT A 3'; reverse = 5' AAA ATG GTC AGA GAA ACC 
TTT ATC TGT 3'. After a 2-min denaturation step, 5.0 units of 
AmpliTaq DNA polymerase (Perkin-Elmer, Norwalk, CT) was 
added under hot start conditions and amplification achieved by 
thermal cycling for 30-40 cycles at 95 °C for 30 s, 60 °C for 2 
min, and a final extension at 72 °C for 3 min. PCR products were 
quantified by absorbance at 260 nm and stored at -20 °C until 
required for the LDR assays. 

LDR and Electrophoresis of LDR Products. Reference 
LDRs were executed in a total volume of 100 µL in 0.2-mL 
polypropylene microtubes using a commercial thermal cycling 
machine (Genius Series 96-well Thermal Cycler, Techne, 
Minneapolis, MN). The reaction cocktail typically employed in this 
work consisted of 20 mM Tris-HCl (pH 8.3), 25 mM KCl, 10 mM 
MgCl2, 0.5 mM NAD+ (nicotinamide adenine dinucleotide, a cofactor 
for ligase enzyme), 0.01% Triton X-100, 30 nM discriminating 
primers (7.5 nM each), 30 nM Com-2 fluorescently labeled primer 
(see Table 1 for sequences of primers), mixture of PCR products 
(wild-type and mutant DNA), and 2 units/µL of ligase enzyme. 
The concentration ratio of the mutant-to-wild-type DNA was 
adjusted from 0:1 (mutant/wild-type, control) to 1:1000. It is known 
that incorporation of bovine serum albumin (BSA; ~0.5 µg/µL) 
into a reaction mixture is essential in order to avoid deactivation 
of the bioenzyme due to its nonspecific adsorption to PC 
surfaces. Our studies indicated that this procedure also could 
effectively prevent the majority of nonspecific adsorption of ligase 
enzyme to PC used as our substrate material for CFLDR. The 
LDR mix was preheated to 94 °C for 2 min and then subjected to 
20 LDR thermal cycles using the following temperatures: 94 °C 
for 30 s; 65 °C for 15 s-2 min. To test the fidelity and yield of the 
LDR reaction, slab gel electrophoresis was run on an aliquot of 
each reaction (1 µL of LDR product was mixed with 2 µL of loading 
dye and then 1 µL of that mixture was loaded into an individual 
well of a slab gel). 

Electrophoresis was accomplished using a 5.5% (w/v) cross- 
linked polyacrylamide gel (LI-COR Biotechnology, Lincoln, NE). 
The gel was polymerized between two borofloat glass plates (21 
cm x 25 cm) and placed in the Global IR2 DNA analysis system 
(LI-COR Biotechnology). The electrophoresis was typically run 
at ~1500 V for 2 h. The fluorescence bands were integrated over 
each separation lane with ImageQuant software (Amersham 
Biosciences, Piscataway, NJ). 

Operational Protocol of the Assembled Biochip. Panels E 
and F in Figure 2 show a schematic diagram and picture of the 
flow-through biochip assembly. The construction of the biochip 
was conducted in the following manner. A PDMS O-ring (~1- 
mm i.d.; ~3-mm o.d.; 500 µm thick) (3) used as a gasket was 
placed on top of the laser-drilled microchannel (100-µm diameter) 
(4) of the LDR chip (1). The PMMA array microchip (2) was next 
carefully loaded into the mechanically milled pocket of the LDR 
chip so that the two microchannels (4) and the PDMS gasket (3) 
aligned properly in order to provide the flow-through microfluidic 
network. The array chip was tightly pressed onto the LDR chip 
using mechanical clamps (6) to seal the two chips against the 
PDMS gasket, allowing a leak-free connection, which was 
determined by flowing fluorescent dye (fluorescein) through the 
interconnect (data not shown). Clamping pressed the PDMS 
gasket to ~400-µm thickness and ~100-µm-i.d. through-hole, 
leading to a total volume of the interconnect assembly (laser-drilled 
microchannels and gasket) of <200 nL (~1.6% of CFLDR reactor 
volume). Film resistance heaters (KHLV-101/10, Omega 
Engineering, Inc., Stamford, CT) (7) were attached to the cover plate 
of the LDR chip and the array chip for thermal control of 
hybridization reactions. Three capillary tubes (75-µm i.d.; 363-µm 
o.d.; 18 cm long, Polymicro Technologies, Phoenix, AZ) (5) were 
then affixed to the chips to aid in loading. 

A syringe pump (Pico Plus, Harvard Apparatus, Holliston, MA) 
was used to drive the LDR mixture through the spiral microchannel 
via a capillary tube (5a). A glass syringe (Hamilton, Reno, NV) 
with a syringe-to-capillary adapter was used to make the connection 
between the pump and the microfluidic assembly. Temperatures 
were maintained during operation using the heaters under 
closed-loop PID control (CN77R340, Omega Engineering, Inc.). 
Temperature feedback was supplied through type-K thermocouples 
(5TC-TT-K-36-36, Omega Engineering, Inc.) mounted 
between the cover plates and heaters. The arrangement of 
temperature zones on the spiral channel (95 °C for denaturing 
and 65 °C for annealing and ligation) is depicted in Figure 2A. 
The resultant LDR products were directly pumped into the 
hybridization chamber via the laser-drilled microchannels (4) and 
the PDMS gasket (3) and subjected to hybridization to the surface- 
tethered zip code probes at 55 °C. The hybridization chamber 
was flushed with a wash buffer (2x SSPE-0.1% SDS) at a 
volumetric flow rate of 4.4 µL/min, and the array chip was finally 
imaged with the near-IR laser-induced fluorescence scanner. 

RESULTS AND DISCUSSION 

Effects of Flow Rate on LDR Product Yield. The speed of 
thermal cycling is usually determined by thermal conduction and 
mass of the polypropylene containers and the heating block. For 
example, commercial PCR machines are based on a temperature- 
controlled metal block holding tubes containing the PCR cocktail 
that is thermally cycled during the PCR. Standard protocols for 
30 thermal cycles can require >2 h of processing time with a large 
fraction of that time required to bring the PCR cocktail to the set 
temperature due to the need for bringing the large metal block 
to the cycle equilibrium temperature and to transfer heat to the 
cocktail through the microfuge tubes. Therefore, the cycle time 
is set by the thermal capacitance of the metal block and the heat 
transfer through the plastic microfuge tubes. Theoretically, 
thermal cycling can be carried out much more rapidly if the 
sample volume is small, the container wall is thin, there is 
reduction in the thermal mass required to be heated/cooled, or 
the surface-to-volume ratio of the sample reaction chamber is high. 
Hence, the use of a microreaction chamber operated in a 
continuous-flow format is very attractive due to the high surface- 
to-volume ratio of the device and the fact that the entire system 
is brought to thermal equilibrium prior to operation (i.e., except 
for the small fluid packets that are transferred from one isothermal 
zone to another zone). A feature of the spiral microchannel in 
this work is its high surface-to-volume ratio (surface, 623 mm²; 
volume, 12 µL; SVR = 51.9 mm^-1). This is much higher than a 
conventional plastic reaction tube (77 mm², 50 µL; SVR = 1.5 
mm^-1). Another advantage of the continuous-flow format employed 
in this work is that it is not necessary to heat or cool the 
amplification vessel repeatedly, allowing thermal cycling to be 
carried out at relatively high speeds. 
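
The surface-to-volume comparison can be checked from the channel 
geometry. The sketch below treats the spiral as a straight rectangular 
channel of the nominal dimensions given in the Experimental Section 
(50 µm x 150 µm x ~1.6 m), which is an approximation; the function 
names are illustrative. 

```python
from typing import Tuple


def rectangular_channel(width_um: float, depth_um: float,
                        length_mm: float) -> Tuple[float, float]:
    """Return (volume in microliters, wall area in mm^2) of a closed channel.

    1 mm^3 equals 1 microliter; end faces are neglected.
    """
    w_mm = width_um / 1000.0
    d_mm = depth_um / 1000.0
    volume_ul = w_mm * d_mm * length_mm
    wall_area_mm2 = 2.0 * (w_mm + d_mm) * length_mm
    return volume_ul, wall_area_mm2


def surface_to_volume_ratio(area_mm2: float, volume_ul: float) -> float:
    """SVR in mm^-1 (area in mm^2 divided by volume in mm^3)."""
    return area_mm2 / volume_ul


if __name__ == "__main__":
    volume, area = rectangular_channel(50.0, 150.0, 1600.0)
    print(f"nominal channel: {volume:.1f} uL, {area:.0f} mm^2, "
          f"SVR {surface_to_volume_ratio(area, volume):.1f} mm^-1")
    print(f"quoted channel:  SVR {surface_to_volume_ratio(623.0, 12.0):.1f} mm^-1")
    print(f"reaction tube:   SVR {surface_to_volume_ratio(77.0, 50.0):.1f} mm^-1")
```

The nominal geometry gives a 12 µL reactor and a wall area close to the 
quoted 623 mm²; the small difference is consistent with the channel 
length being stated only approximately. Either way the microchannel has 
a surface-to-volume ratio more than 30 times that of the reaction tube. 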

It is known that denaturing and renaturation are almost
instantaneous (less than 1 s). Therefore, LDR cycle times should
ultimately be limited by the ligation time, which is determined
primarily by enzyme kinetics. Figure 3 shows the influence of
on-chip ligation time, which is determined by the dwell time of the
LDR cocktail within the 65 °C zone (refer to Figure 2A), on the
LDR product yield. Ligation times of 2, 1, 0.5, and 0.25 min were
investigated using volumetric flow rates of 0.22, 0.45, 0.9, and 1.8
µL/min, respectively, to control the dwell time within the ligation
zone. It should be noted that a sufficient dwell time was generated
for denaturation even when the flow rate was 1.8 µL/min
(denaturation time at the innermost loop <2.5 s). The resultant
LDR products were collected into microfuge tubes from the outlet
of the spiral channel and analyzed using a 5.5% cross-linked
polyacrylamide gel. Fluorescence from the product was imaged
by the gel electrophoresis instrument (Figure 3A) and the
resultant band integrated over each separation lane with
ImageQuant software for quantification (Figure 3B). Using the
ligation times employed in this series of measurements, the longer
the ligation time, the larger the amount of product that was
obtained. For example, the product yield with a 2-min ligation time
was 5x larger than that seen with a 15-s ligation time. However,
the 2-min ligation time required an extra processing time of 45.5
min compared to a 15-s ligation time (52 min with a 2-min ligation
and 6.5 min with a 15-s ligation time for a total of 20 LDR cycles).
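The ligation times and flow rates listed above are inversely related, as expected when dwell time equals zone volume divided by volumetric flow rate. The following minimal sketch assumes a ligation-zone volume of about 0.45 µL per cycle; that value is inferred from the quoted flow-rate/time pairs and is not stated in the text.

```python
# Assumed ligation-zone volume per loop (uL), inferred from the quoted
# pairs (e.g. 0.45 uL/min <-> 1 min); it is not given in the original text.
LIGATION_ZONE_VOLUME_UL = 0.45


def dwell_time_min(flow_rate_ul_per_min: float,
                   zone_volume_ul: float = LIGATION_ZONE_VOLUME_UL) -> float:
    """Dwell time (min) of a fluid packet in a zone of the given volume."""
    if flow_rate_ul_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return zone_volume_ul / flow_rate_ul_per_min


for q in (0.22, 0.45, 0.9, 1.8):
    print(f"{q:4.2f} uL/min -> {dwell_time_min(q):.2f} min ligation time")
```

With this assumed volume the four flow rates reproduce the nominal ligation times of roughly 2, 1, 0.5, and 0.25 min.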

Comparable amounts of product for the microchip and the
reference thermal cycler could be obtained using a 4-min ligation
time (data not shown). However, the difference in product yield
between on-chip LDR and the reference LDR became larger as
the ligation time was reduced (see Figure 3C). The on-chip LDR
product yields were 74 and 52% of that obtained from the reference
thermal cycler at ligation times of 2 min and 15 s, respectively.
The lower net LDR yield at higher flow rates results from thermal
nonequilibrium conditions of the fluidic packet traversing through
the isothermal zones with the set temperature. Finite element
simulations in our previous studies indicate that the reaction
mixture reaches thermal equilibrium with the set temperature at
a flow rate of 0.22 µL/min, which provides an effective residence
time of ~2 min within the nominal ligation zone. A flow rate of
1.8 µL/min corresponds to a 15-s effective residence time and
requires at least 1 s to reach the set temperature. Although the
on-chip LDR produced slightly lower product yields using short
ligation times, the processing time was shorter than that of the
reference cycler. The total processing time for 20 thermal cycles
was 6.5 min using the CFLDR microchip (3.6-s denaturation, 15.9
s for renaturation, ligation, and heat transition; cycling rate ~19.5
s/cycle) while ~25 min was required using the reference cycler
(30-s denaturation, 15 s for renaturation and ligation, and ~29 s
for heat transition; cycling rate of ~74 s/cycle), when the same
nominal ligation time of 15 s was selected for both platforms.
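The total processing times quoted above follow directly from the per-cycle step durations. The sketch below simply sums the steps and multiplies by the number of cycles; the step values are those quoted in the text as best they can be read (3.6 s and 15.9 s on chip; 30 s, 15 s, and about 29 s on the reference cycler), and the function name is illustrative.

```python
from typing import Sequence, Tuple


def cycling_summary(step_times_s: Sequence[float],
                    n_cycles: int = 20) -> Tuple[float, float]:
    """Return (seconds per cycle, total minutes) for a thermal-cycling run."""
    cycle_s = sum(step_times_s)
    return cycle_s, cycle_s * n_cycles / 60.0


# On-chip: denaturation; renaturation + ligation + heat transition
chip_cycle_s, chip_total_min = cycling_summary([3.6, 15.9])
# Reference cycler: denaturation; renaturation + ligation; heat transition
ref_cycle_s, ref_total_min = cycling_summary([30.0, 15.0, 29.0])

print(f"chip:   {chip_cycle_s:.1f} s/cycle, {chip_total_min:.1f} min total")
print(f"cycler: {ref_cycle_s:.1f} s/cycle, {ref_total_min:.1f} min total")
```

The same function reproduces the 52-min figure for a 2-min ligation time if the per-cycle time is taken as 156 s (52 min divided by 20 cycles).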
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On-Chip Multiplexed LDR. The sensitivity of PCR/LDR was 
determined by reconstructing samples containing known amounts
of mutant DNA (derived from cell lines) with a fixed copy number
of wild-type DNA. Samples (mutant G12D, which is the most
commonly found mutation in K-ras, and wild-type G12) were PCR
amplified independently and then mixed in known copy numbers,
with the LDR carried out in the continuous-flow microchip, allowing
a range of mutant-to-wild-type ratios to be evaluated. We examined
signals generated from correct ligation of the mutant template
and backgrounds generated from misligations occurring from the
wild-type template using a multiplex LDR primer set
(discriminating primers to detect G12D, G12A, and G12V and
DY782-labeled common primers; see the left scheme of Figure 4A).
Panels A and B in Figure 4 represent gel images showing LDR
products and an electropherogram of these multiplexed LDRs. No
product was seen when the sample lacked any template in the
reaction (see lane 6). When 10 nM wild-type template (G12) was
added into the reaction mixture, a signal indicating C:A, C:C, and
C:T mismatched ligation products was observed (see lane 5). The gel
band produced from the mutant allele (matched ligation product)
was distinguished from the noise (mismatched ligation product)
at a SNR > 2 with a mutant to wild-type ratio of 1:100 (compare
lanes 2 and 5). The high fidelity of LDR in distinguishing mutant
target (complete match between discriminating primer and target)
in the presence of a high molar excess of wild-type sequences is
due to the ability of the thermostable ligase to rapidly dissociate
from substrates containing mismatches. Even if a misligation event
does occur at an early LDR cycle, the product does not undergo
subsequent amplification as is the case in PCR. Therefore, in
contrast to allele-specific PCR, PCR/LDR does not selectively
amplify low-level polymerase errors and, hence, reduces the
chance of false positives. Figure 4C shows the difference in
product yield and ligation fidelity between Tth ligase and Taq
ligase. Taq ligase showed a slightly larger product yield compared
to Tth ligase with no significant difference in ligation fidelity.

Hybridization Stringency Using LDR Buffer. SSPE or
sodium chloride-sodium citrate (SSC) buffer along with SDS has
been conventionally used as a hybridization buffer for Southern
and Northern blots in order to increase hybridization stringency.
The use of such hybridization buffers following LDR would require
incorporation of additional microchannels, microfluidic mixers,
microvalves, or perhaps DNA extraction chambers prior to
hybridization since the composition of the LDR buffer (see
Experimental Section) is different from that of the hybridization
buffer, which may adversely affect hybridization stringency. We
investigated the effects of the buffer composition on the
hybridization (see Figure 5). The compatibility of the LDR buffer
with the low-density zip code universal array was evaluated to
determine whether post-LDR mixing with the appropriate
hybridization buffer would be necessary. An LDR was performed
using the reference cycler and discriminating primer to detect
wild-type template G12 and a common primer, which should produce
a matched LDR product that has a complementary sequence to zip
code probe 1. After the reaction, the product solution was diluted
with 5x SSPE-0.1% SDS buffer at a volumetric ratio of 50:50 (see
Figure 5A). On the other hand, the same product solution was
diluted with the LDR cocktail at a volumetric ratio of 50:50 (see
Figure 5B) to produce the same concentration of solution
complementary sequences. The solutions were hydrostatically moved
through the microarray chip at a volumetric flow rate of 1.8 µL/min
for 5 min set to 55 °C. After hybridization, the chip was rinsed
with wash buffer and imaged using the near-IR scanner. No
significant differences in hybridization stringency were observed
between the LDR buffer and the hybridization buffer. The need for
high stringency in the hybridization is relaxed using the universal
array format employed herein, since there is large sequence
variability between different elements (zip code probes) of the
array and the mutation screening assay is decoupled from
hybridization.

Analytical Chemistry, Vol. 77, No. 10, May 15, 2005 3251



Figure 4. [...] wild-type DNA with different ratios of mutant to
wild-type sequences. (A) Left, schematic diagram of mutant G12D and
normal G12 double-stranded templates with the three discriminating
(D, A, and V) and common primers used in the LDR; right,
fluorescence images of the on-chip LDR products analyzed
electrophoretically using a 5.5% polyacrylamide gel matrix. Lane 1,
1 nM mutant in 10 nM wild-type template (1:10); lane 2, 0.1 nM
mutant in 10 nM wild-type template (1:100); [...]. (B) [...]
Fluorescence was integrated over the indicated area in the image
[...] integration software. (C) Comparison in the relative LDR
product yield and ligation fidelity between Taq and Tth ligase
enzymes.



Detecting K-ras Mutations Using the PC/PDMS/PMMA
Microfluidic Assembly. As discussed previously, the mutation
signal from matched product was distinguished from noise (signal
from mismatched product) at a sensitivity level of 1:100 using gel
electrophoresis. However, the allelic composition of the specific
point mutation could not be identified when unknown templates
are examined using gel electrophoresis because all of the
discriminating primers have the same size (47 nt). Providing
differences in primer length for the discriminating primers could
make the identification possible, although gel electrophoresis
is a time-consuming step and is limited in the number of mutations
and their allelic composition that can be simultaneously analyzed.
Alternatively, the identification of successful LDRs can be carried
out using a microarray by displaying positions of fluorescence
signatures generated on a solid support where known probes (zip
codes) are tethered.

As demonstrated in Figure 3, the larger the flow rate of sample
through the CFLDR device, the smaller the amount of product
generated due to reduced ligation time. However, larger flow rates
can accelerate hybridization kinetics, reducing analysis time.
Therefore, it was necessary to balance processing speed with
detection sensitivity, both of which depend on the volume flow
rate at which the biochip was operated. The hybridization assay
was performed at a flow rate of 1.8 µL/min using the biochip in
order to detect the presence of low-abundant G12D mutations in
the presence of a majority of wild-type G12 sequences (see Figure
6). This volume flow rate (ligation time, 15 s) was chosen since
it provided adequate LDR product for analysis (see Figure 3).



When only wild-type templates G12 were added into the reaction
cocktail with one discriminating primer, which was specific for
the G12 sequence, and one common primer, only C:G matched
products were produced as seen by the corresponding signal
generated at zip code 1 (1, see plot 1 in Figure 6B). When all
four discriminating primers, which can detect alleles G12, G12D,
G12A, and G12V, were used with G12, small amounts of mismatched
products (C:A, C:C, and C:T mismatches) were produced
as well as a vast majority of C:G matched products. The
mismatched products hybridized to their complementary zip code
probes (2, see graph 2 in Figure 6B), providing background noise
when attempting to detect target mutations. When 1 nM K-ras
G12D mutant was added into the reaction mixture along with 10
nM G12, large amounts of T:A matched LDR products were
generated and hybridized to the appropriate zip code probes (zip
code 3) (3, see graph 3 in Figure 6B). The signal intensity was
well above background produced from C:A mismatched products
(compare graph 3 to graph 2 for zip code probe 3 in Figure 6B).
The higher signals at zip code probes 5 and 11 were produced
by generation of T:C and T:T mismatched products due to the
presence of the mutant G12D allele (compare graph 3 to graph 2
at zip code probe 5 and zip code probe 11). Using our image
analysis software, we integrated the total fluorescence counts at
zip code 3 (mutation signal, T:A match) and zip code 5
(background generated from misligation, C:C + T:C mismatch) at a
level of 1:100 mutation to wild-type sequences (4, see graph 4 in
Figure 6B). This analysis provided a count number of approximately
520 000 for zip code 3 (corrected for autofluorescence from PMMA
substrate at locations of the array not bearing zip code probes)
and 211 823 counts for zip code 5 (corrected for PMMA
autofluorescence). These numbers provided a signal-to-background
ratio of ~2.5 at the mutant to wild-type ratio of 1:100.
Implementing longer hybridization times (>5 min, i.e., lower volume
flow rates) did not seem to enhance the relative detection
sensitivity of these mutations because the background due to the
mismatched products increased at the same level as the matched
ligation products. The time required for scanning the entire
hybridization microchamber (500 µm wide x 6.7 mm long) was <10.3
min with a step resolution of 25.4 µm. A 50.8-µm step resolution,
which can complete the scanning in 2.6 min, also provided
well-resolved images because of the low-density array format used
herein.
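The two scan times quoted above are consistent with a raster scan whose duration scales with the number of pixels: doubling the step size in both directions cuts the pixel count, and hence the scan time, by about a factor of four. The sketch below assumes scan time is strictly proportional to pixel count (stage overhead is ignored); the function names are illustrative.

```python
import math


def pixel_count(width_um: float, length_um: float, step_um: float) -> int:
    """Number of raster positions needed to cover a rectangular area."""
    return math.ceil(width_um / step_um) * math.ceil(length_um / step_um)


def scaled_scan_time(ref_time_min: float, ref_step_um: float,
                     new_step_um: float, width_um: float = 500.0,
                     length_um: float = 6700.0) -> float:
    """Estimate scan time at a new step size, assuming time ~ pixel count."""
    ref_pixels = pixel_count(width_um, length_um, ref_step_um)
    new_pixels = pixel_count(width_um, length_um, new_step_um)
    return ref_time_min * new_pixels / ref_pixels


# Chamber dimensions and step sizes are those quoted in the text.
estimate = scaled_scan_time(10.3, 25.4, 50.8)
print(f"estimated scan time at 50.8 um steps: {estimate:.2f} min")
```

Under this proportionality assumption the estimate comes out close to the 2.6 min reported for the coarser step.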

CONCLUSIONS 

We have fabricated a flow-through biochip that consisted of
two different materials, PC and PMMA, for the detection of
low-abundant DNA mutations in gene fragments (K-ras) that carry
point mutations with high diagnostic value for colorectal cancers.
The microchips possessed discrete functions, i.e., the PC chip
for continuous-flow LDR and the PMMA chip for universal zip
code array readout. The physicochemical properties of these
materials (high glass transition temperature for PC and lower
fluorescence background and minimal nonspecific DNA adsorption
for PMMA) matched the operational needs for each chip.
The two microchips were assembled in a three-dimensional
architecture using an interconnect, which was fabricated by laser
drilling microchannels and included a microgasket fabricated from
an elastomer (PDMS). The integrated biochip was micromanufactured
using several different techniques, such as hot embossing
for producing the intrachip fluidic network and laser ablation for
producing the interchip fluidics. Our experiments indicated the
ability to detect one mutant sequence in 100 normal sequences
using this integrated device. The miniaturized reaction channel
and the continuous-flow operation of the LDR microthermal cycler
accelerated the reaction primarily through its enhanced thermal
management capabilities. The zip code array constructed in the
microfluidic channel displayed improved hybridization kinetics
compared to conventional nonflow formats due to enhanced mass
transport to its surface and minimal diffusional distances. Because
of these attributes, the assay could be carried out rapidly: 6.5
min for on-chip LDR, 10 min for washing, and 2.6 min for scanning
(~19.1 min total). This is a significant reduction in processing
time when compared to previous work where all of these steps
were carried out using conventional instrument platforms: 95 min
for LDR, 15-min preincubation, 120 min for hybridization, 10 min
for washing, and 30 min for imaging (total processing time ~270
min). In addition to enhanced processing speed, the reagent
volume required for the microchip format was reduced >10-fold
compared to the conventional format. Our biochip could be
manufactured at relatively low cost (~$0.33/chip, materials cost
only) due to the use of replication technology and polymer parts
and contained only passive elements, a particularly attractive
format for clinical applications where disposable-type devices are
a necessity to eliminate false positives arising from carryover
effects. We are currently working on the incorporation of a primary
PCR amplification step into the stacked fluidic network as well as
cell lysis and DNA extraction to provide a rapid and
high-sensitivity PCR/LDR/hybridization biochip to provide
multiplexed processing for massive parallel screening of
single-nucleotide changes in cancer cells for clinical staging of
solid tumors.
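The processing-time comparison in the conclusions is a straightforward sum of the step durations listed for each platform. The short sketch below tallies both workflows; the dictionary names are illustrative, and the step values are the ones quoted in the text.

```python
# Step durations in minutes, as quoted in the conclusions.
on_chip_min = {"on-chip LDR": 6.5, "washing": 10.0, "scanning": 2.6}
conventional_min = {
    "LDR": 95.0,
    "preincubation": 15.0,
    "hybridization": 120.0,
    "washing": 10.0,
    "imaging": 30.0,
}

chip_total = sum(on_chip_min.values())
conventional_total = sum(conventional_min.values())

print(f"biochip total:      {chip_total:.1f} min")
print(f"conventional total: {conventional_total:.0f} min")
print(f"reduction factor:   {conventional_total / chip_total:.1f}x")
```

On these figures the biochip workflow is roughly an order of magnitude faster than the conventional one.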
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Universal DNA array 
detection of small 
insertions and deletions 
in BRCA1 and BRCA2

Array-based mutation detection methodology typically relies on
direct hybridization of the fluorescently labeled query sequence to
surface-bound oligonucleotide probes. These probes contain either
small sequence variations or perfect-match sequence. The intensity
of fluorescence bound to each oligonucleotide probe is intended to
reveal which sequence is perfectly complementary to the query
sequence. However, these approaches have not always been
successful, especially for detection of small frameshift mutations.
Here we describe a multiplex assay to detect small insertions and
deletions by using a modified PCR to evenly amplify each amplicon
(PCR/PCR), followed by ligase detection reaction (LDR).
Mutations were identified by screening reaction products with a
universal DNA microarray, which uncouples mutation detection
from array hybridization and provides for high sensitivity. Using the
three BRCA1 and BRCA2 founder mutations in the Ashkenazi
Jewish population (BRCA1 185delAG; BRCA1 5382insC; BRCA2
6174delT) as a model system, the assay readily detected these
mutations in multiplexed reactions. Our results demonstrate that
universal microarray analysis of PCR/PCR/LDR products permits rapid
identification of small insertion and deletion mutations in the
context of both clinical diagnosis and population studies.

We have developed this method as an alternative to direct
probe/target hybridization, which when used to study mutations in
exon 11 of BRCA1, failed to detect the presence of the 1128insA
mutation. Insertion and deletion mutations in p53 also proved
refractory to detection by direct hybridization; none of five
frameshift mutations in exons 2-11 of p53 were detected. These
inaccuracies may result from disruption of secondary structure:
perfect-match sequence may assume a secondary structure that is
eliminated in variant sequences. This structural change may lead to
binding of variant target to perfect-match probe with higher binding
affinity than true perfect-match target. Another consequence of
direct hybridization is the formation of stable duplexes by looping
out of noncomplementary sequence during hybridization. Either of
these illegitimate hybridizations could produce false negative signals
on an array.

Direct hybridization of mutation-containing target sequence to
sequences on the array has an additional shortcoming: the difficulty
of simultaneously assaying sequence tracts with localized regions of
high G·C and A·T content. This may also lead to false negative
signals. An alternative array detection scheme based on
single-nucleotide extension also fails to detect slippage of
mononucleotide repeat sequences. To overcome these deficits, we have
developed a universal microarray wherein signal detection is
completely uncoupled from mutation identification.

In screening more than 80 samples, we successfully detected the
small insertions and deletions found in the three Ashkenazi BRCA1
and BRCA2 mutations. The exons in which these mutations are
located were selectively amplified from genomic DNA samples using
a multiplex PCR reaction (PCR/PCR/LDR) designed to minimize
primer-specific differences in amplification efficiency, as outlined in
Figure 1. Specific insertions and deletions were first detected by
multiplex LDR and subsequently separated by electrophoresis. A
representative gel is shown in Figure 2A. Each lane contains
comparable amounts of each LDR product, indicating equivalent
amplification efficiency for each sample. All wild-type signals are
blue in color (FAM label), whereas the mutant signals are green (TET
label). This arrangement enables rapid visual screening of samples
for the presence or absence of mutations.

Ligase detection reaction had been shown to be a sensitive assay
for detecting point mutations. To determine the limits of sensitivity
of this assay for frameshift mutations, we performed a simulation
experiment wherein a defined quantity of genomic DNA with a
known mutation was diluted from 1:2 to 1:20 with wild-type
genomic DNA. The mixed DNA samples were amplified as before
and then subjected to multiplex LDR to detect the presence of the
mutation under both single- and multiple-mutation conditions.
Figure 2B shows that each of the three BRCA1 and BRCA2 mutations
can be detected when diluted 1:20, even in a multiplex format also
containing diluted wild-type LDR oligonucleotide controls and all
three mutations.

Figure 1. Outline of the PCR/PCR/LDR method for detection of
mutations in BRCA1 and BRCA2. Multiplex amplification of the relevant
exons is carried out to ensure equal amplification of all products. A
limited number of PCR cycles is performed using gene-specific
primers, with further rounds of amplification primed from the universal
sequences located at the extreme 5' ends of the PCR primers. LDR is
then used to detect both wild-type and mutant versions of the queried
sequence. The ligation oligonucleotides hybridize to both wild-type and
mutant products, but ligate only when both primers are perfectly
matched with no gaps or overlaps. Products are electrophoretically
separated or hybridized to a microarray for identification.

Figure 2. Multiplex LDR detection of three specific mutations in
BRCA1 and BRCA2 in a gel-based assay. (A) Representative LDR gel
[...]. By labeling the discriminating upstream oligonucleotides with
either FAM (for wild-type) [...] and by adding "tails" of different
lengths to LDR oligonucleotides, ligation products can be
distinguished based on [...]. Wild-type products are identified at
the [...] side of the gel. Mutant products are identified at the
[...]. (B) [...] mutations were diluted into wild-type DNA before
amplification. Ligation products from multiplex LDR are shown for
each dilution. BRCA1 del AG, BRCA1 ins C, and BRCA2 del [...] only
mutant sequence, using 500 fmol of each LDR oligonucleotide [...].

The ability of PCR/PCR/LDR to rapidly process samples in quantity
was tested in a blind study using 249 DNA samples derived from
Ashkenazi Jewish individuals. Although the concentration of the
DNA ranged from 15 ng/µl to 3,000 ng/µl, samples were assembled
in 9 x 9 configurations without prejudice and pooled across rows
and down columns. After PCR amplification, these pooled samples
were then subjected to LDR as described above. We confirmed
mutations by analyzing individual DNA samples residing at the
intersection of columns and rows found to be positive for mutation.
This strategy allowed unique typing of each sample. Of the 249
samples, 248 were correctly identified. The single sample that was
incorrectly identified as wild type proved to be too dilute and fell
below the limits of detection when mixed with nine other samples of
higher concentration. The number of individual reactions carried out
was reduced from 249 to 96 by this strategy (55 pooled samples and 41
individual samples used for confirmation).
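The row-and-column pooling strategy described above can be sketched as follows. This is a minimal simulation under idealized assumptions: every pool containing a mutant sample tests positive (the dilution limit noted in the text is ignored), and the sample identifiers and function names are hypothetical.

```python
from typing import List, Set, Tuple


def grid_of(samples: List[str], size: int = 9) -> List[List[str]]:
    """Arrange sample IDs row by row in a size x size grid (last row may be short)."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def pool_results(grid: List[List[str]],
                 mutants: Set[str]) -> Tuple[Set[int], Set[int]]:
    """Simulate pooled assays: a row or column pool is positive if it holds a mutant."""
    pos_rows = {r for r, row in enumerate(grid) if any(s in mutants for s in row)}
    n_cols = max(len(row) for row in grid)
    pos_cols = {c for c in range(n_cols)
                if any(c < len(row) and row[c] in mutants for row in grid)}
    return pos_rows, pos_cols


def candidates(grid: List[List[str]], pos_rows: Set[int],
               pos_cols: Set[int]) -> List[str]:
    """Samples at intersections of positive rows and columns need confirmation."""
    return [grid[r][c] for r in sorted(pos_rows) for c in sorted(pos_cols)
            if c < len(grid[r])]


samples = [f"S{i}" for i in range(81)]          # hypothetical sample IDs
grid = grid_of(samples)
rows, cols = pool_results(grid, {"S13", "S40"})  # hypothetical mutant carriers
to_confirm = candidates(grid, rows, cols)
n_pools = len(grid) + max(len(row) for row in grid)
print(to_confirm, "total reactions:", n_pools + len(to_confirm))
```

When mutant samples fall in different rows and different columns, the intersections include some wild-type samples as well, which is why the candidates are confirmed individually rather than called directly from the pools.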

Since gel electrophoresis limits the number of mutations that can
be simultaneously analyzed, analysis of the LDR products was
transferred to a zip code-based microarray system. Zip-code sequences
consist of 24 bases that are assembled from a set of 36 tetramers.
Each tetramer differs from the others by at least two bases and is
neither self-complementary, nor complementary to any other tetramer.
Similarly, each zip code differs from the others by at least three
tetramer units. The end result of this design is sequences that have
comparable behavior in terms of the thermodynamics and kinetics
of hybridization, while simultaneously maintaining distinct chemical
identities that prevent cross-hybridization. Each zip code
becomes associated with a particular mutation when its complement
is attached to the nonreactive 3' end of a common LDR
oligonucleotide that is specific for a particular mutation. As LDR is
a specific reaction, ligation products fusing the fluorescent- and
zip-code complement-containing oligonucleotides are produced only in
the presence of a template sequence that is an exact match at the
ligation junction. Since the zip codes reside at defined addresses on
the microarray surface, each zip-code complement will direct specific
wild-type or mutant LDR products to a unique address on the array.
Hence, zip-code hybridization on an array allows complete physical
separation of products and obviates the need for gel electrophoresis.
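The zip-code design rules stated above (tetramers differing by at least two bases, no self-complementary or mutually complementary tetramers, and zip codes differing by at least three tetramer units) can be expressed as simple checks. The sketch below is illustrative only: it interprets "complementary" as reverse-complementary, and the example tetramers used with it are hypothetical rather than taken from the actual set of 36.

```python
from itertools import combinations
from typing import Sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def valid_tetramer_set(tetramers: Sequence[str]) -> bool:
    """Check the tetramer rules: length 4, no (self-)complementarity, >= 2 mismatches."""
    members = set(tetramers)
    for t in tetramers:
        # A reverse complement found in the set covers both the
        # self-complementary case and complementary pairs.
        if len(t) != 4 or reverse_complement(t) in members:
            return False
    return all(hamming(a, b) >= 2 for a, b in combinations(tetramers, 2))


def zip_codes_distinct(zip_a: Sequence[str], zip_b: Sequence[str],
                       min_units: int = 3) -> bool:
    """True if two zip codes (lists of tetramers) differ in >= min_units positions."""
    return sum(x != y for x, y in zip(zip_a, zip_b)) >= min_units


print(valid_tetramer_set(["TTGA", "AATC", "GGTA"]))  # hypothetical tetramers
```

A full design would apply `valid_tetramer_set` to all 36 tetramers and `zip_codes_distinct` to every pair of six-tetramer zip codes.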

Preliminary microarray experiments using primers designed in
the gel-based format (i.e., differentially labeled discriminating
primers and identical common primers) demonstrated that wild-type
and mutant versions of the three BRCA sequences were readily

Figure 3. LDR detection of three specific mutations in BRCA1 and
BRCA2 on an addressable universal microarray. The diagram at
upper left shows the assignment of each control and mutant
sequence to specific addresses on the array surface. Control
signals are directed to the upper three addresses; mutant signals
are assigned to the lower three. The upper right image shows signal
produced by a wild-type DNA. Left panel: representative
hybridizations for individual DNA samples. Right panel:
representative hybridizations for each mutation using pooled
samples of DNA from Ashkenazi individuals. The mutations are
identified on the extreme right.

Table 1. PCR and LDR oligonucleotides(a)
Oligonucleotide description                   Sequence

PCR primers
  Universal primer A (Uni A)                  5'-ggagcacsctatcccgttaaac-3'
  Universal primer B (Uni B)                  5'-cgclgccaactaccacacatg-3'
  BRCA1 exon 2 forward                        5'-Uni A-TCATTGOAACAGAAAGAAATGGATTTATC-3'
  BRCA1 exon 2 reverse                        5'-Uni B-TCTTCCCTAQTATGTAAaQTCAATTCTGTTC-3'
  BRCA1 exon 20 forward                       5'-Uni A-ACTTCCATTGAAGGAAGCTTCTCTTTC-3'
  BRCA1 exon 20 reverse                       5'-Uni B-ATCTCTGCAAAQQGGAGTGQAATAC-3'
  BRCA2 exon 11 forward                       5'-Uni A-CAAAAWQTCTGGAnGGAGAAAQTnC-3'
  BRCA2 exon 11 reverse                       5'-Uni B-rTGGAAAAQACTTGCTTQGTACTATCTTC-3'

LDR gel-based assay
Discriminating oligonucleotides:
  BRCA1 ex 2 wild-type position 185           5'-Fam-BaCATTAATGCTATGCAGAAAATCTTAOAG-3'
  BRCA1 ex 2 position 185 del AG mutation     5'-Tet-QTCATTAATGCTATGCAGAAAATCTTAQ-3'
  BRCA1 ex 20 wild-type position 5382         5'-Fam-CCAAAQCGAQCAAGAQAATCC-3'
  BRCA1 ex 20 position 5382 ins C mutation    5'-Tet-a»CAAAGCGAGCAAOAGAATCCC-3'
  BRCA2 ex 11 wild-type position 6174         5'-Fam-oaCrrGTGGGATTnTAQCACAQCAAGT-3'
  BRCA2 ex 11 position 6174 del T mutation    5'-Tet-TACTTGTGGGATTTrTAGCACAGCAAG-3'
LDR common oligonucleotides:
  BRCA1 ex 2 position 185                     5'-P-TGTOCCATCTGCTAAGTCAGCACAAAC-B-3'
  BRCA1 ex 20 position 5382                   5'-P-CAGGACAGAA/\GGTMAGCTCCCTC-B-3'
  BRCA2 ex 11 position 6174                   5'-P-GaA*AATCTGTCCAGGTATCAaAT-B-3'

LDR microarray assay
Discriminating oligonucleotides:
  BRCA1 ex 2 control                          5'-Cy3-TGCATAQGAQATAATCATAGGAATCC-3'
  BRCA1 ex 2 position 185 del AG mutation     5'-Cy3-aTCAnAATGCTATGCAGAAAATCnAG-3'
  BRCA1 ex 20 control                         5'-Cy3-CCTCTGACTTCAAAATCATGCTG-3'
  BRCA1 ex 20 position 5382 ins C mutation    5'-Cy3-CAAAQCGAGCAAQAOAATCCC-3'
  BRCA2 ex 11 control                         5'-Cy3-CTTCCCTATACTACATTTACATATATCTGAAG-3'
  BRCA2 ex 11 position 6174 del T mutation    5'-Cy3-TACT7GTCGGATTrTTAGCACAGCAAG-3'
Common oligonucleotides for controls:
  BRCA1 exon 2 control + cZip 1               5'-P-CAMTTAATACACTCTTQTGCTOACTTACCA-cgcasattngogclagamcaa-B-3'
  BRCA1 exon 20 control + cZip 2              5'-P-AAAQAAACCAAACACAACCCATCAG-tlcgccglCg1gtagoctmcaa-B-3'
  BRCA2 exon 11 control + cZip 3              5'-P-TrTCCAAACTAACATCACAAGQTGATATTT-ocgtaagcccgtatggcagatcaa-B-3'
Common oligonucleotides for mutations:
  BRCA1 exon 2 position 185 + cZip 9          5'-P-TGTCCCATCTGGTAAGTCAGCACAAAC-oatcgtcoctncaatgggatcaa-B-3'
  BRCA1 exon 20 position 5382 + cZip 10       5'-P-CAQGACAGAAAGGTAAAQCTCCCTC-caaggcacgtcccagacgcaicaa-B-3'
  BRCA2 exon 11 position 6174 + cZip 11       5'-P-GGAAAATCTGTCCAGGTATCAGAT-goacgggagctgacgacgtgtcaa-B-3'

(a) LDR oligonucleotides are depicted in the 5' to 3' orientation. In
all cases, upper-case bases indicate genomic sequence; lower-case
bases indicate nongenomic sequence (either complementary zip codes
or universal primers); bold lower-case bases indicate nongenomic
bases added to control the size of the final product. P, phosphate;
B, blocking group; cZip, complementary zip code.

detected on the array (data not shown). In this version, both types of
sequences were directed to the same addresses (e.g., BRCA1
185delAG and BRCA1 185 wild type were both directed to zip code
1). Although this format proved successful, PCR/PCR/LDR has the
potential of detecting hundreds of mutations in a single-tube
reaction, and this design does not make optimal use of the array for
such large-scale mutation detection experiments. In order to establish
the experimental paradigm for future studies, we expanded the
addressable format by choosing a sequence in each of the amplicons to
use as a control LDR product. Thus, rather than require detection of
wild-type sequences for each mutant LDR product, this format uses
a single product to serve as a positive control for multiple different
sequence variants within an amplicon. One advantage of this format
is that it minimizes oligonucleotide synthesis; additionally, the
use of each of the 64 positions is maximized. Since the number of
LDR products that can be detected at a single address is limited by
the number of currently available spectrally separated fluorescent
labels, confining the control to a specified region of the array
permits one more sequence variant to be detected at each remaining
address. In the experiments described here, control and mutant LDR
products for a queried position were directed to six separate
addresses on a 64-position array. As a control for reproducibility,
each address was spotted in quadruplicate.

All three frameshift mutations were detectable by hybridization
to the universal array (Fig. 3). Only combinations of the six
possible addresses were visible following hybridization, and no
additional signals were detected at any of the unused addresses.
Thus, zip-code hybridization is very specific. Control and mutant
signals were clearly present for each of the mutations derived from
samples of DNA from single individuals (Fig. 3, left panel). Pooled
DNA used in analyzing the 249 DNA samples described above
produced signals for mutations identical to those found in the
gel-based assay (Fig. 3, right panel). In each case, the array
reproduced the result of the gel.

In addition to allowing rapid screening of multiple individuals,
PCR/LDR can also identify individuals with multiple mutations. A
previous blind study performed using 144 tumor samples
successfully detected all mutations out of 19 possible closely
clustered single-base mutations that occur at codons 12, 13, and 61
of the K-ras gene. In addition, PCR/LDR successfully detected
length polymorphisms resulting from small insertions and deletions
in both dinucleotide repeats and mononucleotide repeats in the APC
(APC I1307K) and the TGF-β type II receptor genes. Results from
these studies were corroborated by and exceeded detection by direct
sequencing.

Recently, PCR/LDR was used in combination with the universal DNA
array to detect K-ras mutations in tumor and cell line DNA. Together,
these approaches can provide the high sensitivity required to detect
low-level mutations (unavailable in other arrays) and the speed
required to rapidly assay large numbers of clinical samples. Indeed,
our experiment analyzing mutations in pooled DNA samples emphasizes
the utility of this approach in screening large populations. The
ability to detect mutations in pooled samples will facilitate
large-scale correlative studies, where unclassified polymorphisms are
compared in diseased and healthy cohorts to determine if particular
polymorphisms contribute to development of disease. Finally, since
multiple loci can be simultaneously assayed, it may be possible to
investigate genetic interactions by analyzing combinations of alleles
in both diseased and unaffected cohorts.
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Experimental protocol 

PCR/PCR amplification. Polymerase chain reaction was carried out as a
single-tube, multiplex reaction to simultaneously amplify BRCA1 exons 2
and 20 and BRCA2 exon 11. Genomic DNA was extracted from blood samples
of Ashkenazi Jewish individuals and amplified in a 25 µl reaction
mixture containing 100 ng of DNA, 400 µM of each dNTP, 1x PCR buffer II
(10 mM Tris-HCl pH 8.3 at 25°C, 50 mM KCl) supplemented with 4 mM
MgCl2, 1 U AmpliTaq Gold, and 1 pmol of each gene-specific primer
bearing either universal primer A or B on the 5' ends (see Table 1 for
sequences). The reaction was overlaid with mineral oil and preincubated
for 10 min at 95°C. Amplification was performed for 15 cycles as
follows: 94°C for 15 s, 65°C for 1 min. A second 25 µl aliquot of
the reaction mixture was added through the mineral oil containing 35
pmol each of universal primers A and B. Cycling was repeated using 55°C
annealing temperature. The reaction was next digested with a 3 µl
solution of 1 mg/ml proteinase K-50 mM EDTA at 55°C for 10 min.
Proteinase K was eliminated by a final incubation at 90°C for 15 min.

To test mutation detection in pooled diluted samples, simulation
experiments were performed. DNA samples with known mutations were diluted
1:2, 1:5, 1:10, and 1:20 with wild-type DNA before PCR amplification, and
then subjected to LDR. For pooling blinded Ashkenazi Jewish DNA samples,
the tubes containing the DNAs were assembled into a 9 x 9 gridded format
and aliquots from each tube were combined across the rows and then down
the columns to produce one tube of combined DNA for each row and each
column. The pooled DNA was then subjected to amplification and LDR. See
Table 1 for the PCR and LDR primer sequences.
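
The row-and-column pooling scheme can be sketched in a few lines of Python. This is an illustrative sketch only, not software used in the study: the function name `candidate_positions` and the example pool results are assumptions. With 81 tubes in a 9 x 9 grid, 18 pooled reactions are run instead of 81, and a mutation carrier must sit at an intersection of a positive row pool and a positive column pool.

```python
from itertools import product

GRID_SIZE = 9  # 9 x 9 grid of sample tubes, as in the pooling scheme above


def candidate_positions(positive_rows, positive_cols, size=GRID_SIZE):
    """Return the (row, column) positions consistent with the positive pools.

    A tube can hold a carrier only if both its row pool and its column pool
    gave a signal. When several rows and columns are positive, every
    intersection is a candidate that needs individual retesting.
    """
    for index in list(positive_rows) + list(positive_cols):
        if not 0 <= index < size:
            raise ValueError(f"pool index {index} is outside the {size} x {size} grid")
    return sorted(product(set(positive_rows), set(positive_cols)))


# Hypothetical pool results, for illustration only.
single = candidate_positions(positive_rows=[2], positive_cols=[7])
ambiguous = candidate_positions(positive_rows=[1, 4], positive_cols=[3, 8])
print(single)          # one carrier, located without further testing
print(len(ambiguous))  # number of intersections to retest individually
```

A single positive row and column identifies the carrier directly; two positive rows and two positive columns leave four candidate tubes, so some individual follow-up reactions remain necessary in that case.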

LDR conditions. Oligonucleotide synthesis and purification were carried
out as described. Tth DNA ligase was overproduced and purified as described
elsewhere. LDR was performed in a 20 µl reaction containing 500 fmol of
each primer (or as specified above for specific applications), 2 µl of
amplified DNA, and 20 mM Tris-HCl, pH 7.6; 10 mM MgCl2; 100 mM KCl; 10 mM
dithiothreitol; 1 mM NAD. The reaction was heated to 94°C for 90 s before
adding 25 fmol of Tth DNA ligase, and then subjected to 20 cycles of 15 s at
94°C and 1 min at 65°C. Electrophoretic separation was performed at 1400
volts using 8 M urea-10% polyacrylamide gels and an ABI 373 DNA sequencer.
Fluorescent ligation products were analyzed and quantified using the ABI
GeneScan 672 software.

Universal DNA microarray. Microarrays were processed and spotted as
described, using a PixSys5500 robot enclosed in a humidity chamber
(Cartesian Technologies, Irvine, CA). Briefly, LDR reactions were
hybridized in 32 µl containing 300 mM 2-(N-morpholino)ethanesulfonic acid,
pH 6.0, 10 mM MgCl2, 0.1% SDS at 65°C for 1 h in a rotating chamber. After
washing in 300 mM bicine, pH 8.0, 10 mM MgCl2, 0.1% SDS for 10 min at 35°C,
the array was imaged on an Olympus Provis AX70 microscope using a 100 W
mercury burner, a Texas Red filter cube, and a Princeton Instruments
TEK512/CCD camera. The 16-bit grayscale images were captured using
MetaMorph Imaging System (Universal Imaging, West Chester, PA) and rescaled
to more narrowly bracket the LDR signal before conversion to 8-bit
grayscale. The 8-bit images were colored using Adobe Photoshop to render
the Cy3 signal red

ABSTRACT

Purpose: Molecular profiling of alterations associated
with lung cancer holds the promise to define clinical
parameters such as response to treatment or survival. Because
<5% of small cell lung cancers and <30% of non-small cell
lung cancers are surgically resectable, molecular analysis
will perforce rely on routinely available clinical samples
such as biopsies. Identifying tumor mutations in such
samples will require a sensitive and robust technology to
overcome signal from excess amounts of normal DNA.

Experimental Design: p53 mutation status was assessed
from the DNA and RNA of biopsies collected prospectively
from 83 patients with lung cancer. Biopsies were obtained
either by conventional bronchoscopy or computed
tomography-guided percutaneous biopsy. Matched surgical
specimens were available for 22 patients. Three assays were used:
direct sequencing; a functional assay in yeast; and a newly
developed PCR/ligase detection reaction/Universal DNA
array assay.

Results: Using the functional assay, p53 mutation was
found in 62% of biopsies and 64% of surgical specimens
with a concordance of 80%. The sensitivity of the functional
assay was determined to be 5%. Direct sequencing
confirmed mutations in 92% of surgical specimens but in only
78% of biopsies. The DNA array confirmed 100% of
mutations in both biopsies and surgical specimens. Using this
newly developed DNA array, we demonstrate the feasibility
of directly identifying p53 mutations in clinical samples
containing <5% of tumor cells.

Conclusions: The versatility and sensitivity of this new
array assay should allow additional development of
mutation profiling arrays that could be applied to biological
samples with a low tumor cell content such as bronchial
aspirates, bronchoalveolar lavage fluid, or serum.

INTRODUCTION 

Over the past 20 years, lung cancer has remained the
leading cause of cancer-related deaths in the world, and the
overall 5-year survival has remained unchanged over this time at
an abysmal 15% (1, 2). At present, clinical prognostic indicators
such as Tumor-Node-Metastasis staging classification or
performance status remain the main parameters used for treatment
decisions. A major obstacle to curative treatment of lung cancer
is the early onset of extrapulmonary dissemination. Small cell
lung cancers are almost never accessible to surgical resection,
whereas only 20-30% of non-small cell lung cancer patients
presenting with apparently localized disease receive either
surgery as sole treatment or multimodality treatment, including
chemotherapy and/or radiotherapy with surgery (3).

Lung cancer is the clinical expression of a disease
representing the end point of a series of specific somatic genetic and
epigenetic changes that precede the invasive tumor by many
years (4). These changes include loss of heterozygosity at
chromosomes 3p, 9p, 17p, microsatellite instability, p16, and other
tumor suppressor gene promoter methylation, K-ras, and/or p53
mutations. The use of these changes as a clonal marker to detect
rare tumor cells in body fluids such as sputum, bronchoalveolar
lavage, bronchial aspirates, biopsy, and serum would be very
promising for the early diagnosis of lung cancers. However, to
date, the potential prognostic, predictive, and therapeutic value
of detecting these alterations has been disappointing, partly due
to the lack of power of a single alteration and partly due to
heterogeneity between the various assays. Furthermore, many of
the studies performed to date have been retrospective, using
either frozen tissue or paraffin-embedded samples from surgical
specimens. The use of these surgical specimens to screen for
new molecular markers in either retrospective or prospective
studies may be unintentionally biased because it tends to focus
on only a subset of patients because: (a) most lung cancers are
unresectable, (b) patients with resectable tumors have a better
prognosis, and (c) patients with resectable cancer generally
receive neoadjuvant chemotherapy before surgery.

To meet the challenge of molecular profiling of tumors,
there is an urgent need to develop routine molecular diagnostic
procedures to manage small or heterogeneous samples such as
biopsies, bronchial aspirates, bronchoalveolar lavage, or
sputum. It is equally urgent to develop sensitive assays able to
overcome the small size and low percentage of tumor cell
content of these samples. Biopsies are a suitable material
because they are routinely performed in every patient suspected to
have a lung tumor.

Among the various potential markers, accurate detection of
p53 mutations could be clinically meaningful because this
protein plays a key role in drug-induced apoptosis. Consequently,
p53 mutational status could influence tumor response to
chemotherapy. Furthermore, p53 mutations are frequent and occur
early in lung cancer, making them attractive as markers for early
detection of tumor cells. The discordance in the literature
concerning the clinical relevance of p53 mutational status may be
partly caused by different methods of analysis (5). We have
recently established that the analysis of the central region of the
gene (exons 5-8) misses ~13% of mutations, with half of these
mutations corresponding to null mutations (5). The correlation
between p53 gene mutation and p53 protein accumulation in
tumor cells is also only 70% based on studies analyzing the
entire p53 gene. This indicates that immunohistochemical
analysis is not sufficiently sensitive. Moreover, recent studies have
emphasized the concept that p53 mutants may present a
heterogeneous behavior. Only a specific subset of p53 mutations could be
of clinical value, and this subset could be different depending on
the type of cancer or the treatment regimen used (6-11).

We have developed a prospective program to establish
routine DNA and RNA extraction of biopsy specimens at the
time of diagnosis. In this prospective study, we analyzed the p53
gene status using two sensitive methodologies: the yeast
functional assay originally developed by Dr. Richard Iggo (12) and
the PCR/ligase detection reaction (LDR)/Universal array
developed by Dr. Francis Barany (13-15). We demonstrate that the
yeast assay is more sensitive than direct sequencing for
detection of p53 mutations in clinical specimens contaminated by a
high proportion of stromal cells and can be used for routine
analysis. Use of the PCR/LDR/Universal array also achieves a
throughput and sensitivity that cannot be achieved by other
currently available technologies.

MATERIALS AND METHODS 

Patients. A cohort of 210 consecutive patients was
prospectively evaluated for newly suspected lung cancer over a
20-month period (June 2000 to February 2002) in our chest
surgery department. Fiber optic bronchoscopy was performed in
all patients. Nonsurgical biopsies were used as the diagnostic
procedure in 170 patients. Diagnostic material was obtained
either by biopsy of an endobronchial lesion visualized during
bronchoscopy or by computed tomography (CT)-guided
percutaneous biopsy when bronchoscopy was not contributive.
During bronchoscopy, four biopsies were taken and fixed in alcohol,
formalin, and acetic acid for diagnosis, and two additional
biopsies were taken and snap-frozen in individual cryotubes in
liquid nitrogen at the time of endoscopy when the procedure was
well tolerated (without respiratory intolerance, excessive cough,
or bronchial bleeding). For CT-guided percutaneous biopsy,
only one sample was taken and fixed in alcohol, formalin, and
acetic acid, and a second biopsy was taken and snap-frozen at
the time of CT scan, if well tolerated by the patient. No
additional biopsy was performed for the purpose of this study, and
all alcohol, formalin, and acetic acid-fixed and
snap-frozen-paired biopsies were archived in the Tenon Hospital pathology
department. Among the 134 patients from whom snap-frozen
biopsies were obtained, the diagnosis of lung cancer could not
be performed on alcohol, formalin, and acetic acid-fixed
specimens in 28 cases, and the snap-frozen-paired biopsies were
used to avoid another diagnostic procedure for the patient.
Finally, frozen tissues from 106 patients (86 obtained by
bronchoscopy and 26 obtained by CT-guided percutaneous biopsy)
were the subject of the present study.

This procedure did not increase the number of biopsies for
investigative purposes and only used specimens already
acquired for routine diagnosis, as recommended by the French
governmental Agence Nationale d'Accréditation et d'Evaluation en
Santé in its "Recommendations for tumor cryopreserved cell
and tissue libraries for molecular analyses." As recommended,
patients were informed that a part of the pathological specimens
could be used for molecular analysis provided that a definitive
pathological diagnosis was obtained on formalin-fixed samples.
Pathological Procedure. Snap-frozen biopsies, 1-3 mm
in diameter and stored at -80°C, were cut in a cryostat chilled
to -30°C. To avoid cross-contamination between tissues, the
razor was moved 0.5 cm after each section was cut. In this way,
a cryostat razor was used to cut 10-12 different specimens.
After use, the razor was washed with distilled water, ethanol
dried, and exposed for 30 min to a UV bank before starting a
new series of sections. A first 5-µm slide was processed with
Toluidine blue stain to assess the tumor cell content
(Supplementary Figs. 1-7). If the slide contained at least 10% tumor
cells, 10-20 adjacent 10-µm frozen sections were cut and
immediately placed in a cryotube immersed in liquid nitrogen.
Another slide was stained to check that the block still contained
tumor cells. If the first frozen section slide did not contain
tumor, a second or third section was cut deeper into the tissue
block, and frozen slides were only prepared for molecular
analysis if this microscopic examination showed the presence of
tumor. If three consecutive Toluidine blue-stained slides
were negative, the sample was not used, and the second frozen
sample was accessed for similar processing. Among the 106
biopsies processed, 20 were eliminated because the biopsy was
histologically negative for tumor cells, one was eliminated
because it corresponded to a lung metastasis from a primary breast
cancer, and 2 were eliminated because the tissue was too
necrotic. A total of 83 samples was therefore processed for
molecular analysis (Table 1). For 22 patients from whom biopsies
were available, surgical specimens were also available, leading
to a total of 105 samples. The pathologist obtained the samples
within 40-60 min after devascularization of the lobectomy or
pneumonectomy. Histological control and sectioning were
performed as described above. The pathologist (M. Antoine)
classified these specimens semiquantitatively: + if it contained
0-25% of tumor cells; ++ if it contained 25-50%; +++ if it
contained 50-75%; and ++++ if it contained 75-100%. The
WHO international histological classification was used to assess
the final pathological diagnosis. Specimens from 83 subjects
were therefore studied in the present article.
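
The sample accounting and the semiquantitative scale above can be checked with a short Python sketch. It is illustrative only: the function name `tumor_content_grade` is hypothetical, the four grades are assumed to be quartiles of tumor cell content, and a value falling exactly on a boundary is assumed to go to the lower grade (the text does not say how boundaries were handled).

```python
def tumor_content_grade(percent_tumor_cells):
    """Map a tumor cell percentage to the +/++/+++/++++ scale.

    Assumption: grades are quartiles of tumor cell content, and a value on
    a boundary (25, 50, 75) is assigned to the lower grade.
    """
    if not 0 <= percent_tumor_cells <= 100:
        raise ValueError("tumor cell content must be between 0 and 100%")
    for upper_bound, grade in ((25, "+"), (50, "++"), (75, "+++"), (100, "++++")):
        if percent_tumor_cells <= upper_bound:
            return grade


# Sample accounting, using only counts stated in the text.
biopsies_processed = 106
excluded = {"no tumor cells": 20, "breast metastasis": 1, "too necrotic": 2}
biopsies_analyzed = biopsies_processed - sum(excluded.values())
total_samples = biopsies_analyzed + 22  # 22 matched surgical specimens

print(biopsies_analyzed, total_samples)
print(tumor_content_grade(30), tumor_content_grade(70))
```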

Nucleic Acid Extraction and Processing. DNA and
RNA extraction was performed simultaneously using the DNA/
RNA minikit (Qiagen 14123). Genetic material from surgical
specimens was resuspended in either TE [10 mM Tris (pH 8.0),
1 mM EDTA] (DNA) or water (RNA) in a final volume of 20
and 25 µl, respectively. Genetic material from biopsies was
resuspended in a final volume of 10 µl. The yield of RNA and
DNA allowed multiple independent PCR amplifications for
either direct sequencing or functional p53 assay.

Reverse Transcriptase-PCR and PCR Analysis. Reverse
transcription of RNA was performed using 2 µl of RNA.
The RNA was incubated for 5 min at 65°C before adding 18
units of random primers (Invitrogen), 100 units of the
Superscript II reverse transcriptase (Invitrogen), 10 mM DTT, 40 units
of the RNase inhibitor, RNaseOUT, and 1.25 mM
deoxynucleoside triphosphate. The reaction was incubated for 1 h at 45°C in
a final volume of 20 µl. After inactivation at 72°C for 3 min, 2
µl of the cDNA preparation were used for PCR in a final
volume of 20 µl [1.25 units of error-free Pfu polymerase
(Stratagene), 0.5 µM of each primer, 50 µM deoxynucleoside
triphosphate, and 10% DMSO]. The amplification conditions
were as follows: 5 min at 94°C, then 30 cycles of 30 s at 94°C,
30 s at 63°C, 2 min at 74°C, followed by 10 min at 74°C (final
extension step). Five µl of the product were used for agarose gel
analysis. For the yeast assay, the 5'- and 3'-region of p53 cDNA
was amplified separately. For the 5'-region, we used
phosphorothioate-modified primers P3
(ATTTGATGCTGTCCCCGGACGATATTGAAsC, where s represents a
phosphorothioate linkage) and P17
(GCCGCCCATGCAGGAACTGTTACACAsT). For the 3'-part, we used P16
(GCGATGGTCTGGCCCCTCCTCAGCATCTTsA) and P4
(ACCCTTTTTGGACTTCAGGTGGCTGGAGTsG). The size of these two reverse
transcriptase-PCR products was 611 and 569 bp, respectively.
For genomic DNA analysis, PCR was performed in a final
volume of 25 µl [0.625 units of TaqGold polymerase (Applied
BS), 0.2 µM of each primer, 200 µM of each deoxynucleoside
triphosphate, 4 mM MgCl2]. The amplification conditions were
as follows: 10 min at 95°C, then 30 s at 95°C, 30 s at 60°C, 60 s
at 72°C (35 cycles), and 10 min at 72°C (final extension step).
Primers for amplification of genomic DNA have been
described previously (16). Five µl of the product were used for
agarose gel analysis. DNAs were sequenced using the Big Dye
Read reaction terminator kit (PE Biosystems) and an ABI 3100
genetic analyzer according to the manufacturer's instructions.
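
The two cycling programs described above can be written down as data and their programmed hold times summed, which is a quick way to sanity-check a protocol transcription. This is a minimal sketch under stated assumptions: the class names `Step` and `Program` are hypothetical, and ramp times between temperatures (which depend on the thermal cycler) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temperature_c: float
    seconds: int


@dataclass(frozen=True)
class Program:
    initial: Step   # initial denaturation or enzyme activation
    cycle: tuple    # steps repeated in every cycle
    cycles: int
    final: Step     # final extension

    def hold_seconds(self):
        """Total programmed hold time; ramping between steps is not counted."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.cycles * per_cycle + self.final.seconds


# Values taken from the text: reverse transcriptase-PCR and genomic DNA PCR.
rt_pcr = Program(Step(94, 300), (Step(94, 30), Step(63, 30), Step(74, 120)), 30, Step(74, 600))
genomic_pcr = Program(Step(95, 600), (Step(95, 30), Step(60, 30), Step(72, 60)), 35, Step(72, 600))

print(rt_pcr.hold_seconds() / 60, genomic_pcr.hold_seconds() / 60)  # minutes
```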

Yeast Assay. Transcriptional activation is the critical
biochemical function of p53, which underlies its tumor
suppressor activity. Mutant p53 proteins fail to activate transcription. A
yeast strain (yIG397), defective for adenine synthesis because of
a mutation in its endogenous ADE2 gene but containing a
second copy of the ADE2 open reading frame controlled by a
p53 response promoter, has been developed. Because
ADE2-mutant strains grown on low-adenine plates turn red, yIG397
colonies containing mutant p53 are red, whereas colonies
containing wild-type p53 are white. For the assay, the yeast strain
was cotransformed with reverse transcriptase-PCR-amplified
p53 and a linearized expression vector. p53 cDNA is therefore
cloned in the vector in vivo by homologous recombination. To
minimize mutations introduced during PCR, we used Pfu DNA
polymerase (Stratagene), a high-fidelity polymerase. In the
original assay described by Flaman et al. (12), only one reverse
transcriptase-PCR product was amplified and transformed in the
recipient yeast. The cutoff for mutation was established as
>15% red colonies, indicating the presence of a p53 mutation
(12). Although >70% of red colonies are usually obtained for
tumors with a high tumor DNA content, ambiguous results may
be observed for tumors with a lower tumor cell content or with
highly heterogeneous tumor cells. We and other authors (17-19)
have also observed that the background of red colonies (false
positive) can be heterogeneous from one sample to another,
leading to difficulties defining a precise cutoff value. This
heterogeneity was reproducible from one sample to another,
suggesting that each sample of genetic material could have an
inherent behavior that could be due either to the quality of the
starting material, contaminating compounds affecting the
processivity of the enzyme, or both. Bearing this problem in mind,
Waridel et al. (20) developed a split functional analysis of
separated alleles in yeast (FASAY), where the p53 cDNA is
amplified into two overlapping PCR fragments that are
independently transformed in the recipient yeast with the appropriate
vector. The first fragment (P3-P17) corresponds to residues
52-236, whereas the second fragment (P4-P16) corresponds to
residues 195-364. Because there is only one mutation/p53 cDNA,
the main advantage of this improvement is that one PCR fragment
for each sample will lead to background colonies, whereas the other
fragment will lead to red colonies if a mutation is present.

Recovery of p53 Plasmids from Yeast and DNA
Sequencing. For each sample yielding >15% of red colonies,
the pooled plasmid DNA from 10 red yeast colonies was
extracted and sequenced to make a final decision concerning
mutations. The plasmid DNA was sequenced using the Big Dye
Read reaction terminator kit (PE Biosystems) and an ABI 3100
genetic analyzer according to the manufacturer's instructions.
For samples with <15% of red colonies, DNA from 10 red
colonies was individually sequenced to distinguish true
mutations from the background of PCR errors.

PCR-LDR Assay for p53 Mutations. PCR/LDR/Universal
Array assays were generally performed as described in
Favis et al. and Gerry et al. (14).

p53 exons 5-8 were simultaneously amplified in single-tube
reactions. Primer sequences, in 5'- to 3'-orientation, were
as follows: exon 5 forward = CTGTTCACTTGTGCCCTGACTTTC;
exon 5 reverse = CCAGCTGCTCACCATCGCTATC;
exon 6 forward = CCTCTGATTCCTCACTGATTGCTCTTA;
exon 6 reverse = GGCCACTGACAACCACCCTTAAC;
exon 7 forward = GCCTCATCTTGGGCCTGTGTTATC;
exon 7 reverse = GTGGATGGGTAGTAGTATGGAAGAAATC;
exon 8 forward = GGACAGGTAGGACCTGATTTCCTTAC; and
exon 8 reverse = CGCTTCTTGTCCTGCTTGCTTAC. To
ensure amplification of all exons, PCR was performed by using
primers containing a universal primer sequence at the 5'-ends.
The initial PCR reaction was performed as previously described
(13, 15) with the following modifications. The 25-µl PCR
reaction mixture contained 3-5 µl of primary tumor DNA, all
four deoxynucleoside triphosphates (400 µM of each one), 10
mM Tris-HCl (pH 8.3), 50 mM KCl, 4 mM MgCl2, 0.625 units of
AmpliTaq Gold (PE Applied Biosystems, Inc., Norwalk, CT), 2
pmol of gene-specific primers containing a 5'-universal
sequence for exons 5, 6, and 8, and 4 pmol of a similar primer for
exon 7. The reaction was preincubated for 10 min at 95°C.
Amplification was performed for 15 cycles as follows: 94°C for
15 s and 65°C for 1 min. A second 25-µl aliquot of the reaction
mixture, containing 35 pmol of universal primer, was then
added. PCR was repeated for 25 cycles at an annealing
temperature of 55°C for 1 min. Amplification was verified by
examining the products on 3% agarose gel. Taq polymerase was
inactivated by 3 cycles of freezing in dry ice.

After a multiplex PCR amplification of the regions of
interest, each mutation was simultaneously detected using a
thermostable ligase that joins pairs of adjacent oligonucleotides
complementary to the sequences of interest. Ligation occurs
only when there is perfect complementarity at the junction
between the 5'-fluorescent-labeled upstream oligonucleotide,
containing the discriminating base for the mutation on the
3'-end, and the adjacent downstream oligonucleotide,
containing a complementary zip code sequence on the 3'-end. The
complete set of LDR primers is described in Favis et al.
Ligation products are distinguished on the basis of differential
labeling and capture of the zip code complement on its cognate
zip code address on a universal array.
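
The discrimination principle can be illustrated with a toy model in Python. It is a deliberately simplified sketch, not a model of ligase kinetics: the function names and all sequences are hypothetical, base pairing is abstracted away by writing the target and both primers on the same strand, and ligation is assumed to require an exact match of both primers with no gap between them.

```python
def ligates(target, upstream, downstream):
    """Toy ligation rule: both primers match the target exactly and abut.

    The last base of `upstream` is the discriminating base at the junction;
    any mismatch there (or elsewhere, in this simplified model) prevents
    ligation.
    """
    return (upstream + downstream) in target


def detect_alleles(target, upstream_by_allele, downstream):
    """Return the alleles whose discriminating primer ligates on the target."""
    return [allele for allele, upstream in upstream_by_allele.items()
            if ligates(target, upstream, downstream)]


# Hypothetical sequences, for illustration only.
wild_type_target = "ACGTTGCAGGCTAGCTTACG"
mutant_target = "ACGTTGCAGACTAGCTTACG"   # G -> A at the junction base
upstream_primers = {"wild-type": "GTTGCAGG", "mutant": "GTTGCAGA"}
common_primer = "CTAGCTTA"               # would carry the zip code complement

print(detect_alleles(wild_type_target, upstream_primers, common_primer))
print(detect_alleles(mutant_target, upstream_primers, common_primer))
```

Because each ligation product carries both the label of its upstream primer and the zip code complement of the common primer, reading the array address and the label together identifies the allele; the sketch only covers the ligation decision itself.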

LDR reactions were carried out in a 20-µl mixture
containing 20 mM Tris-HCl (pH 7.6), 10 mM MgCl2, 100 mM KCl,
10 mM DTT, 1 mM NAD+, 25 nM (500 fmol) of the detecting
primers, 2 µl of PCR product, and 25 fmol of Tth DNA ligase.
Ligases were overproduced and purified as described previously
(21, 22). LDR reactions were incubated for 5 min at 95°C and
were then thermally cycled for 20 cycles of 30 s at 95°C and 4
min at 64°C. Quality control for LDR was performed using a
synthetic template for each mutation to test the ability of the full
mix of upper or lower ligation primers to produce the expected
specific signal on the DNA microarray.
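
The stated primer concentration follows directly from the amount and the reaction volume, as the short Python check below shows. The helper name `concentration_nM` is hypothetical; the only inputs are the 500 fmol of each detecting primer, the 25 fmol of ligase, and the 20-µl reaction volume given in the text.

```python
def concentration_nM(amount_fmol, volume_ul):
    """Convert an amount in fmol and a volume in µl to a concentration in nM.

    1 fmol/µl = 1e-15 mol / 1e-6 L = 1e-9 mol/L = 1 nM, so the ratio of the
    two numbers is already in nM.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_fmol / volume_ul


primer_nM = concentration_nM(500, 20)  # each detecting primer
ligase_nM = concentration_nM(25, 20)   # Tth DNA ligase
print(primer_nM, ligase_nM, primer_nM / ligase_nM)
```

Under these figures each primer is present at a 20-fold molar excess over the ligase.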

Preparation and hybridization were performed as
previously described (13, 14), except that hybridization was carried
out in the presence of 100 µg/ml sheared calf thymus DNA.
Briefly, 20 µl of the LDR reaction were diluted with 20 µl of
2.0× hybridization buffer to produce a final buffer
concentration of 300 mM 4-morpholineethanesulfonic acid (pH 6.0), 10
mM MgCl2, and 0.1% SDS that was incubated for 5 min at 94°C
before loading in the chips. The arrays were placed in
humidified culture tubes and incubated for 1 h at 65°C and 20 rpm in
a rotating hybridization oven. After hybridization, the arrays
were washed in 300 mM bicine (pH 8.0), 10 mM MgCl2, and
0.1% SDS for 10 min at 65°C. Arrays were reused three times
and were stripped between uses by submerging for 1 min in a
solution of boiling 100 mM bicine/0.1% SDS. Stripped arrays
were rinsed in nanopure water, excess water was removed using
forced air, and the arrays were stored in slide boxes at room
temperature.

RESULTS

The clinical and histological characteristics of 83 patients
with lung cancer are shown in Table 1. The distribution of the
various histological types is in agreement with recent data
concerning the distribution of lung cancer in France, indicating
that no recruitment bias occurred during this prospective study (23).

Using total RNA extracted from either the biopsy or the
tumor sample, reverse transcriptase-PCR amplification and
FASAY analysis of all 105 samples (100%) were successful
(Supplementary Fig. 1). FASAY analysis for the detection of p53
mutations has been extensively described, but most of these
studies used a first generation assay with only one PCR product
corresponding to residues 52-364. The cutoff value of red
colonies for a positive result is usually arbitrarily defined
between 10 and 30% (24-26). In the present study, we first used
a 15% cutoff value, leading to the detection of p53 mutations in
44 of 83 biopsies and 14 of 23 tumors. Direct sequencing of
pooled rescued plasmid DNA from yeast led to the identification
of the p53 mutation in 100% of cases (Supplementary Figs. 1-7).
In the split methodology, the p53 gene is cloned into two
fragments. The basic idea is that the number of red colonies
arising in the second fragment not containing the p53 mutation
will always correspond to background mutations. Two p53
mutations are very rarely found in the same allele of the gene.
We calculated the mean percentage of red colonies generated by
the negative fragment of each tumor bearing a p53 mutation.
Samples with mutations in the overlapping segment of the two
PCR products were carefully removed. Only samples with
>15% of red colonies were taken into account in this analysis.
This statistical analysis of the cutoff values was based on 39
samples of P3-P17 and 32 samples of P4-P16 fragments. The
mean percentage of red colonies was 3.4 ± 2.6% for P3-P17 and
4.0 ± 2.4% for P4-P16. Similar mean values were obtained
when the same analysis was performed on tumors negative for
p53 mutations. Using cutoff values of 8.6 and 8.8% (mean + 2
SDs), 7 biopsy specimens gave percentages of red colonies
ranging between these cutoff values and our previous limit of
15% (Table 2). No new p53 mutations were detected among the
surgical specimens. For these 7 specimens, sequencing of 10
individual red colonies led to the detection of p53 mutations (see
"Materials and Methods"). The case of U32 is also noteworthy.
Petri dishes transformed with the P3-P17 PCR product led to
7.2% of red colonies and 5.1% of pink colonies. These pink
colonies have been shown to originate from leaky p53 mutations
that do not completely inactivate function (25, 26).
Sequencing of 10 individual clones from pink colonies detected a
single substitution at codon 180 of the p53 gene in a region
known to lead to mutant p53 with a mild phenotype, whereas
sequencing of individual clones from red colonies led to the
identification of multiple mutations arising from PCR
amplification. This particular example clearly shows that the split
FASAY is a very sensitive method to detect mutant p53 in a
highly heterogeneous tumor sample.
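
The cutoff derivation and the resulting decision rule can be reproduced with a short Python sketch. It is a sketch under stated assumptions, not the authors' analysis code: the function names are hypothetical, and the background values are taken as 3.4 ± 2.6% for P3-P17 and 4.0 ± 2.4% for P4-P16, which are the values consistent with the stated cutoffs of 8.6 and 8.8% at the mean plus two standard deviations.

```python
def fragment_cutoff(mean_percent, sd_percent, n_sd=2):
    """Background cutoff for one fragment: mean + n_sd standard deviations."""
    return round(mean_percent + n_sd * sd_percent, 1)


def fasay_call(percent_red, cutoff, pooled_limit=15.0):
    """Classify a fragment result using the two thresholds from the text.

    Above 15%: pooled plasmid DNA from 10 red colonies is sequenced.
    Between the fragment cutoff and 15%: 10 red colonies are sequenced
    individually. At or below the cutoff: treated as background.
    """
    if percent_red > pooled_limit:
        return "positive: sequence pooled plasmid DNA"
    if percent_red > cutoff:
        return "borderline: sequence 10 individual red colonies"
    return "background"


cutoff_5prime = fragment_cutoff(3.4, 2.6)  # P3-P17 fragment
cutoff_3prime = fragment_cutoff(4.0, 2.4)  # P4-P16 fragment
print(cutoff_5prime, cutoff_3prime)

# Hypothetical red-colony percentages, for illustration only.
for percent_red in (40.0, 12.0, 5.0):
    print(percent_red, fasay_call(percent_red, cutoff_5prime))
```

Lowering the threshold from 15% to the fragment-specific cutoff is what moves a sample such as the hypothetical 12% case from "negative" to "sequence individually".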

Therefore, using the new cutoff value defined above, 52 of
84 biopsies (62%) and 14 of 22 tumors (64%) were positive on
the FASAY (Tables 2, 3, and 4).

The spectrum of missense mutations was as follows: 11
(G:C→A:T) transitions, 6 of which occurred at a CpG
dinucleotide, 19 (G:C→T:A), 6 (T:A→C:G), 3 (A:T→T:A), and 5
(G:C→C:G) transversions. Nine frameshift mutations and 1
splice mutation were also revealed (Table 2). The high
frequency of G:C→T:A transversions, which are usually only found
in lung cancer patients, is associated with tobacco smoking (27).
Five mutations were found in the 157-159 region, a hot spot
region that has been shown to be the specific target of the
tobacco carcinogen benzo(a)pyrene (28). The concordance
between the pattern of p53 mutations described in this article and
published literature based on more conventional procedures
indicates that the functional assay used in the present study did
not induce any specific selection bias for p53 mutations. This
pattern of mutational events is not unexpected because the majority
of patients in the present series were smokers (Tables 2 and 3).

In the series of 22 matched samples of biopsies with
surgical specimens, 7 samples were wild-type in both samples,
11 had the same mutations, and 4 were discordant (Table 3).
To validate this FASAY analysis, direct sequencing was
performed using either DNA or cDNA as starting material. The
identity of the p53 mutation was confirmed in 28 of the 39
biopsies (71%) and 12 of the 13 (92%) surgical samples,
whereas no mutation was detected in the remaining samples
(Tables 2, 3, and 4). It is noteworthy that cDNA sequencing was
more sensitive on 3 samples, confirming previous observations
that mutant p53 RNA may be more stable or may be expressed
at a higher level in tumor cells (29). Failure of sequencing is
certainly caused by the low tumor cell content in the sample and
the lack of sensitivity of automatic sequencing.

We have recently developed a microarray-based assay to
detect p53 mutations that uses a thermostable ligase enzyme to
discriminate between wild-type and mutant templates, resulting
in separation of mutation detection and array hybridization
(13-15). This assay was used to efficiently detect p53
mutations in surgical specimens from patients with colorectal cancer,
but its sensitivity in nonsurgical samples such as biopsies has
not been previously tested. Nine surgical specimens and 27
biopsies with p53 mutations detected by the FASAY were
available for analysis by the array (Table 5 and Fig. 1F). The
array confirmed mutations in all of the 27 biopsies (100%, Table
5), 7 (27%) of which were not confirmed by direct genomic
DNA sequencing (Tables 2 and 3). Two mutations not detected
by direct sequencing were also detected by the array. All p53
mutations were detected by the array for the 8 surgical samples.
For patient C6, in whom biopsy and surgical specimens were
both available, histological examination of the specimen and
FASAY analysis indicated a higher tumor cell content for the
surgical specimen (70 versus 30%). Although FASAY easily
detected a mutation at codon 249 in both samples, direct
sequencing of the biopsy failed to detect the mutation, whereas the
DNA chips clearly identified this event (Fig. 2, A-F). This
feature can be applied to the majority of the samples analyzed in
this study and emphasizes the high sensitivity of this array
technology for biopsy specimens.



DISCUSSION 

Lung carcinomas are typically late-stage and biologically
aggressive, which accounts for their poor prognosis (4). The potential
of new imaging and molecular techniques to significantly improve
the detection of localized lung cancer provides an unprecedented
opportunity to understand the biology, improve diagnosis, enhance
treatment, and reduce mortality (30). Furthermore, recently
developed proteomic and expression array technologies have intensified
the search for new biomarkers that could be helpful in defining
response to therapy or prognosis.

Only 30% of patients with non-small cell lung cancer and
<5% of patients with small cell lung cancer are treated
surgically, implying that the biological sample most frequently
available for routine management at the time of diagnosis is biopsy.
The size and heterogeneity of biopsies raise problems for
current molecular diagnosis techniques. There is therefore an
urgent need to develop sensitive assays for the detection of lung
tumor-specific molecular alterations in routinely available
specimens such as biopsies, bronchoalveolar lavage, or sputum. In
the present prospective study, we demonstrate the feasibility of
routine management and analysis of lung biopsy specimens for
p53 mutation. This includes biopsies obtained using
conventional bronchoscopy as well as CT-guided percutaneous biopsy.
To our knowledge, this is the first time that material obtained by
CT-guided percutaneous biopsy has been processed for
molecular analysis despite the smaller sample size compared with
biopsies obtained by conventional procedures. This is important
in view of the increasing worldwide rate of adenocarcinoma, in
Frequency of red clones is given for the 5'-part (P3-P17) and
3'-part (P4-P16) of p53. More than 1 assay was performed in several
experiments, and all results are shown.

Sequencing of genomic DNA: +, the same mutation was detected in
DNA; -, no mutation detected.

The discrepancy between the surgical specimen and the biopsy could
be due to the very low tumor cell content of the biopsy.

Signal obtained with cDNA amplified from the tumor. No signal was
obtained with genomic DNA.

These patients received neoadjuvant chemotherapy.
Table 4  Summary of p53 mutation analysis

                                     Matched biopsies/tumors
                 Single biopsies     Biopsies          Tumors
FASAY (a)        39/62 (63%)         13/22 (60%) (b)   14/22 (63%) (b)
Sequencing (c)   22/28 (78%)         7/11 (63%)        12/13 (92%)
Chips (d)        18/18 (100%)        9/9 (100%)        9/9 (100%)

(a) Functional analysis of separated alleles in yeast (FASAY)
represents the true frequency of p53 mutations in this series because no
patient selection was performed for the analysis.

(b) Two patients were negative for the biopsies but positive for the
tumor, 1 patient has a different mutation in the tumor and in the biopsy,
and 1 patient with a mixed tumor (small cell lung cancer + non-small
cell lung cancer) had a positive biopsy and a negative tumor (see text for
more details).

(c) Only patients with positive FASAY are indicated. No p53
mutation was found in negative patients (see text for detail).

(d) Only patients with a p53 mutation and for whom the chips assay
was available were tested.



which CT-guided percutaneous biopsy is the method of choice
for these peripheral tumors.

Although p53 mutations are common in lung cancer, the
importance of these mutations for the patient's clinical outcome
is still controversial (5), mainly because of the heterogeneous
strategies used to assess p53 mutational status. Immunostaining
lacks sensitivity because of false negatives from nonsense
mutations, splicing mutations, and deletions that do not lead to p53
accumulation. In the present study, 10 mutations could not have
been detected by immunostaining, and the splice mutation could
not be detected by DNA sequencing (31). The majority of
molecular analyses have also focused on the study of p53 exons
5 through 8. In a recent analysis of the p53 mutation database,
we showed that this bias results in nondetection of ~13% of p53
mutations, and these false negatives may bias interpretation of
the results during statistical analysis.

In the present study, we compared assays based on either
DNA, direct sequencing or arrays, or RNA, the functional assay
in yeast. Initially developed for the detection of germ-line
mutations, the yeast assay has been widely used for the detection of
somatic mutations in various types of tumors, including a few
studies in lung cancer (32, 33). The yeast assay can be used to
screen p53 from exons 4 to 10, which accounts for >95% of p53
mutations. In the present study, using the new split assay
developed by Waridel et al. (20) and an experimentally defined
cutoff value, we show that this assay may be sufficiently
sensitive to detect p53 mutations in samples containing only 5% of
tumor cells. Sequencing of rescued plasmids from red colonies
allowed unambiguous identification of p53 mutations in all
cases, but direct sequencing of genomic DNA was only able to
detect 72% of mutations in biopsy specimens. Until a more
sensitive and specific methodology has been developed, we
believe that the yeast assay should be considered as a reference
method for the evaluation of p53 mutations in clinical specimens,
especially specimens with a low tumor cell content. In addition to
the advantages described above, the FASAY can easily distinguish
true inactivating mutations from neutral mutations. Furthermore,
the use of a short amplicon in reverse transcriptase-PCR also allows
this assay to be performed on biological samples that could lead to
extraction of partially degraded RNA (19).
(a) All samples analyzed by the array were shown to contain a p53
mutation by functional analysis of separated alleles in yeast analysis.
SEQ, detection of p53 mutation by direct sequencing; ARRAY,
detection of p53 mutation by PCR/LDR array.



Although sensitive, this assay has two major drawbacks: it has
a low throughput and it does not provide any information about the
precise p53 mutation, therefore requiring sequencing of rescued
plasmids. Although the first limitation could be circumvented by
automation, the second limitation could be particularly
inconvenient in view of the markedly heterogeneous behavior of various
p53 mutants, leading to different clinical phenotypes. Several
studies in breast cancer suggest that only specific p53 mutations are
associated with de novo resistance to doxorubicin (9).

The PCR/LDR/Universal array assay provides both high
throughput and allows direct identification of the mutational
event, a feature that considerably reduces the cost of this assay.
Furthermore, as demonstrated in the present work, it has a
higher sensitivity than direct sequencing. One of the most useful
aspects of the PCR/LDR/Universal array is its versatility
because the same array can be used for the detection of mutations
in multiple genes such as p53, APC, K-ras, or BRCA1 (13, 14).
Our laboratories are also developing the PCR/LDR/Universal
array to monitor gene promoter hypermethylation, which is a
frequent event in various types of cancer, including lung cancer
(34, 35). Belinsky et al. (36) measured hypermethylation of the
CpG islands in the sputum of lung cancer patients and
demonstrated a high correlation with early stages of non-small cell
lung cancer, which indicated that p16 CpG hypermethylation
could be useful in predicting future lung cancer.

We envision the practical development of very sensitive PCR/
LDR/Universal array assays, specifically programmed to a given
type of cancer such as lung or colon cancer. By querying specific
genes for each type of cancer (e.g., gene mutations or
hypermethylation), it would be possible to achieve a specificity of 90-95% for
identification of tumor cells. Such universal array assays will be
very useful to assess the tumor content of clinical specimens such
as stool, serum, bronchoalveolar lavage fluid, and sputum,
samples that are known to have a low tumor cell content. Using a new
standardized extraction and conservation protocol, we have been
able to extract RNA and DNA from bronchial secretions aspirated
during fiber-optic bronchoscopy (bronchial aspirates) that are
considered to contain tumor cells. FASAY and chips analysis were
successfully performed with this material, indicating the feasibility
of this type of analysis on heterogeneous specimens.

Although the specificity of each gene queried is not high
(current chips are programmed to detect only 50% of p53



Y.-W. Cheng and F. Barany, unpublished observations.

C. Fouquet, M. Antoine, N. Rabbe, J. Cadranel, G. Zalcman, and
T. Soussi, unpublished results.
Fig. 1 Histology and array analysis of two bronchial
biopsies. A, B, D, and E, Toluidine blue
staining of an adenocarcinoma (A and B) and a
small cell lung cancer (D and E). A and D, x25; B
and E, x100. C and F, results of PCR/ligase
detection reaction/Universal DNA microarray
analysis of DNA. Addresses are double spotted
onto a three-dimensional surface comprised of a
loosely cross-linked polymer of acrylamide and
acrylic acid. The three-dimensional surface
combined with the zip code system allows hybridized
arrays to be stripped of target and reused. Fiducials
labeled with [illegible], Bodipy, and Alexa are spotted
along the top and the right side of the array to
provide orientation. Amplicon controls (Ctl) are
seen in the next row. The Cy3 signal indicates that
samples [illegible] and [illegible] present 175 G→A and
220 A→G mutations, respectively.
mutations in lung tumors), the probability of finding an index
marker among the multiple genes queried is very high. The use
of multiple fluorochromes could also improve the throughput of
the assay.
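
The claim that a panel of individually incomplete markers can still
find an index marker with high probability can be made concrete with a
short calculation. The sketch below assumes, purely for illustration,
that the markers behave independently, and it uses invented detection
fractions for every gene other than the 50% figure quoted above for
p53. Neither the independence assumption nor the extra values come
from this study.

```python
from math import prod


def prob_any_marker(detection_fractions):
    """Probability that at least one marker is positive in a tumor,
    assuming (hypothetically) that the markers are independent."""
    return 1.0 - prod(1.0 - p for p in detection_fractions)


# 0.50 is the p53 chip coverage quoted in the text; the remaining
# values are invented placeholders, not measurements.
panel = [0.50, 0.30, 0.30, 0.40]

print(f"p53 alone:   {prob_any_marker(panel[:1]):.3f}")
print(f"whole panel: {prob_any_marker(panel):.3f}")
```

Under these assumed values the chance of at least one positive marker
rises from 0.5 for p53 alone to about 0.85 for the four-gene panel.
Correlated markers would give a smaller gain.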



F. Barany, unpublished results.



Many small and early lesions are now being detected in
high-risk individuals by either low-dose CT scan screening
programs or endoscopic fluorescence devices, but their true
clinical significance remains uncertain. It is not possible to
predict which of these lesions will really progress toward either
overt cancer for dysplastic bronchial epithelial lesions or
metastatic disease for early-stage cancers. It may be appropriate to
target these premalignant changes or small stage I tumors for
Fig. 2 Histology and array analysis
of a matched biopsy and surgical
specimen from the same patient.
Toluidine blue staining of the biopsy
(A and B) and surgical specimen (D
and E) in two magnifications: A and
D (x25); B and E (x100). C and
F, results of PCR/ligase detection
reaction/Universal DNA microarray
analysis of DNA. Amplicon controls
(Ctl) are seen in top row. Both
samples display the same G→T
mutation at codon 249. The
arrangement of capture oligonucleotides on
the array displayed in [illegible] is different
because of a new spotting procedure.



early detection and intervention by fully profiling their
molecular characteristics, including evaluation of response to
specifically targeted intervention. High-throughput technologies such
as genomics and proteomics are becoming widely available, and
it will be crucial to apply these technologies to the detection of
early lung carcinogenesis and outcome assessment. However,
all of these technologies, including sample management and
extraction of nucleic acids, must also be feasible as routine
procedures in major clinical departments. The data presented



here suggest that the PCR/LDR/Universal array assay, applied
to samples containing a minority of tumor cells or DNA,
recruited prospectively, meets these requirements.
Harmonized Microarray/Mutation Scanning
Analysis of TP53 Mutations in Undissected
Colorectal Tumors

Reyna Favis,1 Jianmin Huang,1 Norman P. Gerry,1 Alfred Culliford,2 Philip Paty,2 Thierry Soussi,3
and Francis Barany

1Department of Microbiology and Immunology, Weill Medical College of Cornell University, New York, New York; 2Colorectal Service,
Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, New York; 3EA3493, Laboratoire de Génotoxicologie des Tumeurs,
Institut Curie, Université P.M. Curie, Paris, France

Communicated by Richard G.H. Cotton

Both the mutational status and the specific mutation of TP53 (p53) have been shown to impact both tumor
prognosis and response to therapies. Molecular profiling of solid tumors is confounded by infiltrating wild-type
cells, since normal DNA can interfere with detection of mutant sequences. Our objective was to identify TP53
mutations in 138 stage I-IV colorectal adenocarcinomas and liver metastases without first enriching for tumor
cells by microdissection. To achieve this, we developed a harmonized protocol involving multiplex polymerase
chain reaction/ligase detection reaction (PCR/LDR) with Universal DNA microarray analysis and endonuclease
V/ligase mutation scanning. Sequences were verified using dideoxy sequencing. The harmonized protocol
detected all 66 mutations. Dideoxy sequencing detected 41 out of 66 mutations (62%) using automated reading,
and 59 out of 66 mutations (89%) with manual reading. Data analysis comparing colon cancer entries in the
TP53 database (http://p53.curie.fr) with the results reported in this study showed that distribution of mutations
and the mutational events were comparable. Hum Mutat 24:63-75, 2004. © 2004 Wiley-Liss, Inc.
mismatch recognition

DATABASES:

TP53 - OMIM: 191170; GenBank: X54156.1
http://p53.curie.fr (TP53 Database)



INTRODUCTION 

Loss of TP53 (MIM# 191170) is observed in
approximately one-half of all human cancers, making it
the most commonly inactivated tumor suppressor gene
[Soussi, 2003; Soussi and Beroud, 2001]. By disrupting
TP53 function, cellular stress signals such as DNA
damage, oxidative stress, hypoxia, and nucleotide
depletion [Vogelstein et al., 2000; Vousden and Lu, 2002] go
unheeded, creating a permissive environment for
sequence errors that lead to oncogenic mutation.
Disruption of TP53 activity breaches a second line of defense,
as TP53 also responds to unregulated growth signals
caused by the overexpression of certain oncogenes
[Vousden, 2002]; thus, TP53 disruption can also
contribute to uncontrolled proliferation of cells harboring
activated oncogenes. Normally TP53 will respond to
such signals either by arresting the cell cycle to permit
DNA repair in mildly damaged genomes or by inducing
apoptosis to eliminate cells with severely damaged
genomes. In addition to preventing propagation of
genomic errors, TP53 is also implicated in the regulation
of genes that inhibit angiogenesis and metastatic disease
progression. The tumorigenic potential of a cell is greatly
influenced by the functional status of TP53.

Many prospective cancer therapy studies indicate that
the functional status of TP53 will also influence a
tumor's response to therapy. For example, the commonly
used cancer drug, 5-fluorouracil, is ineffective in
TP53-deficient human cells; however, the DNA damaging
agent, adriamycin, induces apoptosis irrespective of the
TP53 status [Bunz et al., 1999]. Similarly, cells with
transcriptionally inactive forms of TP53 are less sensitive
to vinca alkaloids, but become more sensitive to
The Supplementary material referred to in this article can be accessed at
http://www.interscience.wiley.com/jpages/1059-7794/suppmat
paclitaxel [Zhang et al., 1998]. The presence of wild-type
TP53 activity is essential for therapies dependent on the
anti-angiogenic agent, TNP-470 [Zhang et al., 2000] and
anti-angiogenic combination therapy [Yu et al., 2002],
while the "ONYX-015" adeno-like virus depends on the
complete absence of wild-type TP53 activity in order for
transduction to occur. Recent studies have indicated that
breast cancer patients harboring TP53 mutations have
significantly worse prognoses than those with wild-type
status [Borresen-Dale, 2003]. Thus the functional status
of TP53 may influence treatment outcome.

While the functional status of TP53 in a cell affects
response to therapies and tumor prognosis, several
studies have suggested that not all TP53 mutations are
alike. TP53 mutant proteins with a flexible conformation
correlate with poor prognosis, and different missense
mutations are known to have differing effects on the
conformational stability of TP53 [Chen et al., 2001].
It has also been demonstrated that cell lines ectopically
expressing various mutant forms of TP53 are insensitive
to different chemotherapeutic agents [Blandino et al.,
1999]. In addition, different TP53 mutations have
different cellular consequences. Finally, small synthetic
molecules have been generated that are capable of
restoring wild-type TP53 function both in vitro and
in vivo [Bullock and Fersht, 2001; Bykov et al., 2002].
Thus, the specific TP53 missense mutation in a cell
impacts response to therapies and tumor prognosis.

These initial studies indicate that knowledge of the
specific TP53 mutation is of growing importance.
Although current methods for mutation detection
possess some very desirable characteristics (e.g.,
denaturing gradient gel electrophoresis [DGGE], DHPLC, SSCP,
dideoxy-fingerprinting [ddF], restriction endonuclease
fingerprinting [REF]), these methods are of limited
utility in large-scale prospective clinical trials involving
solid tumors [Elsaleh et al., 2001; Kimler et al., 2000;
Nabholtz et al., 1999, 2002; Soong et al., 2000] due to
low throughput and/or sensitivity (see Kirk et al. [2002]
for review). Immunohistochemical analysis of 142
colorectal tumors demonstrated that only 51% of tumors that
significantly overexpressed the TP53 protein contained
DNA mutations [Kaserer et al., 2000]. Likewise, 32% of
tumors that contained a mutated TP53 gene did not
concordantly overexpress the TP53 protein. Direct
sequencing and gene hybridization chips fail to identify
mutations in TP53 over 20% of the time, due in part to
dilution of mutant alleles by infiltrating stromal cells in
solid tumors [Ahrendt et al., 1999]. Thus, the functional
status of TP53 does not necessarily correlate with
immunostaining, sequencing, or hybridization chip
results [Kaserer et al., 2000].

For effective drug therapy of solid tumors, there is an
urgent need to accurately assess TP53 functional status
and to precisely determine the nature of the TP53
mutation. In order to substantiate that certain factors are
of major effect in influencing outcome, it is necessary to
establish statistical significance by surveying a large
number of tumors. In this study, we sought to improve
both the accuracy and the throughput of TP53 mutation
detection by developing a harmonized protocol that
combines the strengths of two sensitive enzymatic assays.
Rapid analysis is promoted by creating two
complementary, parallel tracts with facility for efficient throughput.
Endonuclease V (EndoV)/ligase mutation scanning can
detect unknown mutations and allows sample pooling
[Huang et al., 2002]. This method has been shown to
detect substitutions, insertion/deletion mutations varying
in size from one to three bases, and scanning ability in
amplicons up to 1.7 kb [Huang et al., 2002]. The
polymerase chain reaction/ligase detection reaction
(PCR/LDR) [Khanna et al., 1999] has substantial
multiplexing capability for predetermined mutations,
which is extended further by coupling analysis to a
Universal DNA microarray [Favis et al., 2000; Gerry
et al., 1999]. Both enzymatic assays have sufficient
sensitivity to allow analysis of undissected solid tumors,
which substantially improves throughput. Figure 1
illustrates how this harmonized protocol functions (Fig.
1A) and provides a flow chart of how the two parallel
tracts advance (Fig. 1B). The current study provides the
first report of the TP53 Universal DNA microarray and
details the application of an assay that combines the
utility of two sensitive enzyme systems to analyze
mutations in undissected solid tumors.

MATERIALS AND METHODS
Tumor Procurement and DNA Extraction

Control DNA samples with known TP53 mutations were
obtained from preexisting samples archived in T. Soussi's
collection. All patients recruited from Memorial Sloan Kettering
Cancer Center underwent surgical resection for primary
adenocarcinoma of the colon. Written informed consent was obtained
from each subject. The majority (95%) of patients were identified
as Caucasian, while 5% were identified as non-Caucasian. A total
of 120 primary colon tumors (15 Stage I, 22 Stage II, 41 Stage III,
and 42 Stage IV) were collected at the time of surgical resection in
accordance with Institutional Review Board approved protocols.
Two to four viable portions of the tumor were harvested by sharp



[Figure 1 caption, partially recovered: PCR/LDR analysis (right side)
uses multiplex LDR; the ligation products are analyzed on a universal
DNA microarray. In EndoV mutation scanning (left side), amplicons for
the same exon that are derived from different tumor samples are pooled
and subjected to EndoV mutation scanning. If mutations are found, each
sample in the pool is reanalyzed individually. Automated sequencing
identifies the specific mutation.]
dissection and snap frozen in liquid nitrogen within 15 min of
removal. From each tumor, four core samples containing a mean
normal cell count between 30-50% were taken from different
regions, combined, and DNA was extracted using the QIAamp
Tissue Kit (Qiagen, Chatsworth, CA) according to manufacturer's
guidelines.

Multiplex PCR Amplification

DNA extracted from undissected primary tumors was subjected
to multiplex PCR amplification of exons 5, 6, 7, and 8 for gel-based
assays, while exon 4 was added to the multiplex for microarray
assays in order to include the codon 72 polymorphism in the
analysis of exons 4-8. PCR was performed in a 50 µl reaction using
100 ng DNA, 100 µM dNTP, 1x PCR Buffer II (Applied
Biosystems, Foster City, CA) supplemented with 1.5 mM final
concentration of MgCl2, 2.5 units AmpliTaq Gold (Applied
Biosystems), and 0.4 µM of each primer. Primer sequences, in 5'
to 3' orientation, were as follows:
exon 4 forward = CCGGACGATATTGAACAATGGTTC;
exon 4 reverse = GCAAGAAGCCCAGACGGAAAC;
exon 5 forward = CTGTTCACTTGTGCCCTGACTTTC;
exon 5 reverse = CCAGCTGCTCACCATCGCTATC;
exon 6 forward = CCTCTGATTCCTCACTGATTGCTCTTA;
exon 6 reverse = GGCCACTGACAACCACCCTTAAC;
exon 7 forward = GCCTCATCTTGGGCCTGTGTTATC;
exon 7 reverse = GTGGATGGGTAGTAGTATGGAAGAAATC;
exon 8 forward = GGACAGGTAGGACCTGATTTCCTTAC;
exon 8 reverse = CGCTTCTTGTCCTGCTTGCTTAC.
PCR was performed by heating the reaction for 10 min at
95°C, followed by 35 cycles of 94°C for 30 sec, 60°C for 50 sec,
and 72°C for 1 min. Amplification was verified by examining the
products on a 3% agarose gel. Taq polymerase was eliminated by
incubating the reaction for 10 min at 70°C with 50 µg/ml Qiagen
proteinase K. This treatment was followed by incubation at 90°C
for 15 min to denature proteinase K. The TP53 sequence used was
GenBank Accession X54156.1, Version X54156.1 GI:35213.

Mutation Detection and Analysis

Using the UMD (Universal Mutation Database) software
described by Beroud and Soussi [2003], we analyzed the frequency
of TP53 mutational events in colon cancer. Among the 1,427
TP53 mutations that are described for colon cancer, there are 375
different variant classes, ranging from a high frequency of
occurrence (such as g.13203G>A [p.R175H] found 177 times)
to 230 variants that are found only one time. From this
information, it was possible to devise 58 ligase detection reactions
(LDR) that allow the detection of 70% of TP53 mutations in colon
cancer. Oligonucleotide design and synthesis, ligase detection
reaction (LDR), and Tth DNA ligase production were performed
as previously described [Barany and Gelfand, 1991; Khanna et al.,
1999]. LDR primers were divided into two tubes, based on
whether the ligation was directed against the upper or lower strand
of DNA. LDR primers designed for gel assays are published
elsewhere [Dong et al., 2001], while those for array experiments
are available in Supplementary Table S1 (available online at
http://www.interscience.wiley.com/jpages/1059-7794/suppmat).
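
The design step described above, choosing a limited set of LDR targets
that together cover a chosen fraction of the database entries, can be
sketched as a simple ranking problem. The code below is an
illustration only and is not the UMD software. Apart from the count of
177 for the most frequent variant, the mutation spectrum is a
hypothetical toy distribution, and `variants_needed` is a name
introduced here.

```python
def variants_needed(counts, target_fraction):
    """Number of the most frequent variants that must be targeted to
    cover at least target_fraction of all recorded mutations.
    Returns (number_of_variants, fraction_covered)."""
    total = sum(counts)
    if total == 0:
        return 0, 0.0
    covered = 0
    for n, count in enumerate(sorted(counts, reverse=True), start=1):
        covered += count
        if covered >= target_fraction * total:
            return n, covered / total
    return len(counts), 1.0


# Hypothetical spectrum: a few hotspots, some recurrent variants,
# and many variants seen only once (values invented for illustration).
toy_counts = [177, 120, 90, 60, 40] + [5] * 20 + [1] * 230

n, coverage = variants_needed(toy_counts, 0.70)
print(f"{n} targeted variants cover {coverage:.1%} of entries")
```

Because the spectrum is dominated by hotspots, a small number of
targets reaches the coverage goal, while the long tail of single
occurrences is left to the mutation scanning tract.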

Following multiplex PCR amplification of the regions of
interest, each mutation is simultaneously detected using a
thermostable ligase that joins pairs of adjacent oligonucleotides
complementary to the sequences of interest. Ligation occurs only
when there is perfect complementarity at the junction between
the paired oligonucleotides. Ligation products are distinguished
based on differential labeling and migration on a polyacrylamide
gel, or hybridization to specific addresses on the universal DNA
microarray. The reaction was performed as previously described
[Favis et al., 2000], except 3 µl of PCR reaction was used. Quality
control for LDR was performed using a synthetic template for each
mutation to test the ability of the full mix of upper or lower
ligation primers to produce the expected, specific signal on the
DNA microarray. In addition, 100 DNA samples derived
from various types of cancers and known to contain TP53
mutations targeted in the PCR/LDR assay were analyzed in a
blinded fashion.
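
The allele discrimination described above can be illustrated with a
toy model. The sketch below is a deliberate simplification: primers
are written in the same sense as the `target` string, so no
complementing is needed, and a product is called only when both
primers match perfectly, which is stricter than the junction
requirement stated in the text. The sequences are invented and are not
TP53; the function names are introduced here for illustration.

```python
def ligates(target, upstream, downstream, junction):
    """True if both primers match `target` perfectly, with the upstream
    primer's 3' end at index junction - 1 and the downstream primer's
    5' end at index junction."""
    start = junction - len(upstream)
    end = junction + len(downstream)
    if start < 0 or end > len(target):
        return False
    return target[start:junction] == upstream and target[junction:end] == downstream


def call_alleles(target, allele_primers, common_primer, junction):
    """Names of the discriminating primers that would give a product."""
    return [name for name, primer in allele_primers.items()
            if ligates(target, primer, common_primer, junction)]


# Toy sequences (not TP53): the queried base is at index 8.
wild_type = "ATGGCCGCGTACCTGA"
mutant = "ATGGCCGCATACCTGA"
allele_primers = {"wild-type": "GGCCGCG", "mutant": "GGCCGCA"}
common = "TACCTG"

print(call_alleles(wild_type, allele_primers, common, junction=9))
print(call_alleles(mutant, allele_primers, common, junction=9))
```

Each template yields a product only with the discriminating primer
whose 3' base matches it, which is what allows many primer sets to be
combined in a single multiplex reaction.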

EndoV/ligase mismatch scanning was performed as described
elsewhere [Huang et al., 2002]. Briefly, heteroduplexed variant
and wild-type PCR amplicons from tumor DNA are generated
using 6FAM and TET labeled primers. The Thermotoga maritima (Tma)
EndoV recognizes and primarily cleaves heteroduplex DNA one
base 3' to a mismatch. Since matched DNA is also nicked at low
levels, a highly specific thermostable DNA ligase is used to reseal
just those nicks. This lowers background signal and improves the
signal-to-noise ratio. Fragment mobility of cleavage products on a
DNA sequencing gel reveals the approximate position of the
mutation. Amplicons from different tumors corresponding to
specific exons were pooled in groups of three. Mutation scanning
was performed, and samples identified as containing TP53
mutations were analyzed individually to confirm the mutation
and then sequenced. Amplicons for exons 5, 6, 7, and 8 were
generated using the PCR primers described for PCR/LDR above
and were sequenced using the dRhodamine Terminator Cycle
DNA Sequencing kit (Applied Biosystems, Foster City, CA)
according to manufacturer's guidelines.
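
The pooling strategy above (scan amplicons in groups of three, then
reanalyze the members of positive pools individually) can be sketched
as follows. This is a bookkeeping illustration only. It assumes,
hypothetically, that a pooled scan is positive whenever any member
carries a mutation, and the sample names and the set of mutant samples
are invented.

```python
def scan_pooled(samples, is_mutant, pool_size=3):
    """Scan samples in pools and re-test members of positive pools
    one by one. Returns (mutant_samples, number_of_scans)."""
    mutants, scans = [], 0
    for i in range(0, len(samples), pool_size):
        pool = samples[i:i + pool_size]
        scans += 1                      # one scan of the pooled amplicons
        if any(is_mutant(s) for s in pool):
            for s in pool:              # deconvolute the positive pool
                scans += 1
                if is_mutant(s):
                    mutants.append(s)
    return mutants, scans


# Hypothetical example: 12 tumors, two of which carry a mutation.
samples = [f"T{i}" for i in range(1, 13)]
truth = {"T2", "T9"}
found, scans = scan_pooled(samples, lambda s: s in truth)
print(found, scans, "scans instead of", len(samples))
```

In this invented example pooling needs 10 scans rather than 12. The
saving grows when mutations in a given exon are rare and shrinks when
most pools are positive.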

Mutations in the TP53 gene were analyzed with the UMD
software [Beroud and Soussi, 2003]. The TP53 database
version used for this analysis (December 2002) contains 14,968
mutations, including 1,516 for colon cancer. Genetic alterations
reported herein use nucleotide numbering identical to Accession
X54156.1.

Microarray Fabrication and Hybridization

Fabrication and hybridization were performed as previously
described [Favis et al., 2000; Gerry et al., 1999], except
hybridization was carried out in the presence of 100 µg/ml sheared
salmon sperm DNA. Arrays were reused three times and were
stripped between uses by submerging for 1 min in a solution of
boiling 100 mM Bicine/0.1% SDS. Stripped arrays were rinsed in
nanopure water, excess water was removed using forced air, and
the arrays were stored in slide boxes at room temperature.

Quality control for array fabrication was performed on
representative arrays by staining with SYBR Green II (Molecular
Probes, Eugene, OR) to determine whether all 64 zip-code
addresses had spotted. To verify that no cross-contamination of
addresses occurred during spotting, selected arrays were subjected
to four hybridizations (stripping between hybridizations) using
6FAM-labeled complementary zip-codes. Arrays were hybridized
such that odd rows, even rows, odd columns, or even columns
were selectively targeted to produce specific signals without
extraneous signals.
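
The four-hybridization cross-contamination check can be expressed as a
comparison between the addresses that were targeted and the addresses
that gave signal. The sketch below assumes an 8 x 8 layout for the 64
addresses, which is a guess at the geometry and is not stated in the
text. The function names and the observed signals are hypothetical.

```python
def qc_pattern(kind, rows=8, cols=8):
    """Set of 1-based (row, col) addresses expected to give signal."""
    tests = {
        "odd_rows": lambda r, c: r % 2 == 1,
        "even_rows": lambda r, c: r % 2 == 0,
        "odd_cols": lambda r, c: c % 2 == 1,
        "even_cols": lambda r, c: c % 2 == 0,
    }
    keep = tests[kind]
    return {(r, c) for r in range(1, rows + 1)
            for c in range(1, cols + 1) if keep(r, c)}


def qc_report(observed, kind, rows=8, cols=8):
    """Return (extraneous, missing) addresses for one hybridization."""
    expected = qc_pattern(kind, rows, cols)
    observed = set(observed)
    return observed - expected, expected - observed


# Hypothetical read-out: all odd rows light up, plus one stray spot.
observed = qc_pattern("odd_rows") | {(2, 5)}
extraneous, missing = qc_report(observed, "odd_rows")
print("extraneous:", sorted(extraneous), "missing:", sorted(missing))
```

Every address is targeted in exactly two of the four hybridizations,
and any two adjacent addresses are separated by at least one of them,
so carry-over between neighboring spots shows up as an extraneous
signal.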

RESULTS 

Validation of the PCR/LDR Tract: 58 Different TP53 
Mutations Can Be Detected Using a Gel-Based Assay 

The TP53 gene is mutated in hundreds of positions,
and thus represents an enormous challenge to mutation
detection strategies [Soussi and Beroud, 2001]. In
addition, different cancers exhibit different preferential
sites of mutation in this gene [Hussain et al., 2000]. To
circumvent problems associated with molecular
heterogeneity in TP53, we initially developed a gel-based assay
to detect high frequency mutations in colon cancer and
established validation criteria for the assay. (Validation of
the EndoV tract can be found elsewhere [Huang et al.,
2002].) By focusing on a single cancer and using a
bioinformatics approach to dictate the assay design, it
was possible to engineer the assay sensitivity such that a
significant percentage (approximately 70%) of
database-predicted mutations for colon cancer could be detected
[Iacopetta, 2003].

LDR is a versatile method for discriminating
single-base mutations or polymorphisms and is ideal for
multiplexing, since several primer sets can ligate along
a gene without the interference encountered in
polymerase-based systems (see Fig. 1 for description).
Additionally, LDR readily discriminates between
wild-type and frameshift or point mutation sequences [Favis
et al., 2000; Gerry et al., 1999; Khanna et al., 1999; Zirvi
et al., 1999].

A total of 111 LDR primers were designed for this
initial assay and the reaction was optimized to achieve
multiplex PCR amplification of exons encompassing the
DNA binding domain (exons 5, 6, 7, and 8), followed by
the multiplex LDR detection of 58 mutations and four
amplicon controls (see Dong et al. [2001] for the list of
targeted mutations and primer sequences). To validate
the assay's ability to detect mutations in clinical samples,
100 blinded tumor DNA samples containing TP53
mutations known to be targeted by the assay were
analyzed (see Fig. 2B). No information regarding the
origin or the nature of the samples was provided until
after results were submitted and samples were unblinded,
hence all samples were treated identically and subjected
to multiplexed PCR followed by multiplex LDR. The
tumor DNA was derived from a variety of cancers, from
both fresh frozen and paraffin-embedded tissue, and
included both normal and mutant samples. In addition,
artificial mixes of tumor genomic DNA diluted 1:20
(p.R273H (g.14487G>A), double mutant p.R196X +
p.R248Q ([g.13346C>T]+[g.14070G>A]), and p.R175H
(g.13203G>A)) and 1:100 (p.R175H (g.13203G>A) and
p.R273C (g.14486C>T)) in normal genomic DNA were
included. The results demonstrated that the PCR/LDR
TP53 assay could detect all mutations that were
represented by LDR primer sets, even when diluted at
1:100 (Fig. 2C). (In natural tumor samples where it was
found that 5% of cells were tumorigenic, we have
demonstrated successful mutation detection using PCR/
LDR [Fouquet et al., 2003].) The two mutant samples
that were refractory contained mutations that were not
included in the LDR primer sets.

Although PCR error was not expected to affect assay
sensitivity, this was verified by comparing the EndoV
mutation scanning results using the proofreading
polymerases. Different polymerases for PCR achieve different
rates of fidelity, with Taq polymerase error rates of 1,3 x 
10 to 3 X 10" per base and proofreading polymerases 
on the order of 5 x 10"'. These error rates are very 
small compared to the target sensitivity of 1 in 100, 
and were not expected to create a signal on the 
sequencing gel comparable to a true mismatch, even at 
a dilution of 1:100. As confirmation, we experimentally 
compared Taq polymerase with commercially available 
high-fidelity mixes containing Pfu or Tgo polymerases, 
with no difference in background observed (data not
shown).



Validation of the PCR/LDR Tract: PCR/LDR Coupled
to Analysis on a Universal DNA Microarray to
Accommodate Analysis of 110 Different TP53
Mutations

To be of clinical utility, it is necessary to be able to survey
a large number of mutations. Because LDR products must have
precisely defined mobility, analysis using a gel-based system
places a practical limit of approximately 60-80 alleles per lane
of a sequencing gel. To overcome this limitation, analysis of
the LDR products was transferred to a universal microarray
system. We expanded the number of LDR primers for universal
microarray analysis so that the reaction could in toto
accommodate 70% of all mutations found in the TP53 database
that were associated with colon cancer (exon 4 was added to the
PCR multiplex to accommodate codon 72 polymorphisms), 65%
associated with osteosarcomas (14 mutations and two
polymorphisms added with 30 new primers), and 80% associated
with lung cancer (38 mutations added using 71 primers). The
mutations detected and the universal microarray design are
shown in Supplementary Figure S1 (available online at
http://www.interscience.wiley.com/jpages/1059-7794/suppmat),
while the LDR primer sequences are shown in Supplementary Table
S1. The universal array (Fig. 3A) was validated as described
for the gel-based assay and then tested for the ability to
detect mutations in undissected tumor samples (Fig. 3B). Each
mutation is uniquely identified based on the color of the
fluorescent signal, the address emitting the signal, and
whether the signal was observed for the "upper" or "lower"
strand reaction.

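The identification scheme described for the universal array (signal
color, array address, and strand reaction) amounts to a lookup keyed
on those three values. The following is a minimal sketch of that idea
only. The color names, address numbers, and the entries in CALL_TABLE
are hypothetical placeholders chosen for illustration; the actual
assignments are defined by the array design in the supplementary
material and are not reproduced here.

```python
from typing import NamedTuple, Optional


class Signal(NamedTuple):
    """One fluorescent spot read from the array."""
    color: str    # fluorescent label observed (placeholder names)
    address: int  # zip-code address emitting the signal
    strand: str   # "upper" or "lower" strand LDR reaction


# Hypothetical lookup table: every key and value below is invented
# for illustration and does not reflect the published array layout.
CALL_TABLE = {
    Signal("green", 12, "upper"): "p.R175H",
    Signal("red", 12, "upper"): "codon 175 wild-type",
    Signal("green", 12, "lower"): "p.R248Q",
}


def call_mutation(signal: Signal) -> Optional[str]:
    """Return the call for a signal, or None if the combination of
    color, address, and strand is not assigned on the array."""
    return CALL_TABLE.get(signal)


if __name__ == "__main__":
    print(call_mutation(Signal("green", 12, "upper")))
    print(call_mutation(Signal("green", 99, "upper")))
```

Keying on all three values at once means the same address can report
different mutations depending on label color and strand reaction.
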
Dideoxy Sequencing Is Insufficient for Mutation
Analysis in Undissected Colon Tumors

DNA was extracted from 120 stage I-IV colorectal tumors
and was examined for TP53 mutations and the status of
amino acid 72. The TP53 gene was analyzed using the
harmonized protocol and DNA sequencing. As described
above, throughput for the EndoV tract was facilitated by
pooling amplicons from specific exons in groups of three;
the pools were scanned, and those bearing mutations were
rescanned individually and sequenced (Fig. 1).

Table 1 shows a comparison between the harmonized protocol and
dideoxy-sequencing. The harmonized protocol identified all 66
TP53 mutations found in the tumor set. We can achieve this high
level of sensitivity and specificity because the two tracts of
the harmonized protocol complement each other. Of the 66
mutations, 14 were mutations not represented in the LDR primer
set, and two mutations fell below the limits of detection for
this approach (p.H193R (g.13338A>G) and p.G245S (g.14060G>A)
were detectable using synthetic template controls); however,
EndoV mutation scanning succeeded in detecting these mutations.
Similarly, EndoV mutation scanning was unable to detect 20 of
the mutations in four specific CG-rich sites known to be
refractory to cleavage by this enzyme [Huang et al., 2002], but
PCR/LDR readily detected these mutations. Because
PCR/LDR/Universal array can find mutations that are refractory
to cleavage by EndoV mutation scanning, while the latter finds
deletions or uncommon mutations not covered by our LDR primers,
this harmonized protocol can find all mutations present in this
cohort of undissected tumor samples. By comparison, direct
sequencing with automated reading of the sequence found only 41
out of 66 mutations (62% sensitivity) and consistently failed to
identify mutations in codon R248, which is the most frequently
mutated TP53 codon for all cancers. When direct sequencing with
automated reading was supplemented with resequencing of both
strands and manual reading, the score improved to 59 out of 66
mutations identified (89% sensitivity). However, for five of the
samples, the mutation could only be detected by gel purifying
the PCR product prior to sequencing. In addition, prior
knowledge of the site of mutation was required when manually
reading sequencing results to identify certain mutations that
were present at low levels.

TABLE 1. Comparison Between the Harmonized Protocol and Dideoxy-sequencing

TP53 mutation                 Genomic change
(7) Y234C (W)                 g.14028A>G
(5) R175H (REV)               g.13203G>A
(8) R273C (REV)               g.14486C>T
(7) R248Q                     g.14070G>A
(5) R175H (REV)               g.13203G>A
(7) R248Q                     g.14070G>A
(5) R175H (REV)               g.13203G>A
(6) R213X (W)                 g.13397C>T
(7) Y234C (W)                 g.14028A>G
(8) R282W                     g.14513C>T
(8) R273C (REV)               g.14486C>T
(7) R248W                     g.14069C>T
(8) R273H                     g.14487G>A
(7) R248Q                     g.14070G>A
(8) R273C (REV)               g.14486C>T
(8) R273H                     g.14487G>A
(5) E171X                     g.13190G>T
(5) G154G (LF, S)             g.13141C>A
(5) G154D (LF)                g.13140G>A
(5) Y163C (LF)                g.13167A>G
(5) R175H (REV)               g.13203G>A
(8) R306X (REV)               g.14585C>T
(6) E224E (LF, S)             g.13432G>A
(5) R175H (REV)               g.13203G>A
(6) H193R                     g.13338A>G
(5) H179Y                     g.13214C>T
(7) R248Q                     g.14070G>A
(7) R248Q                     g.14070G>A
(5) R175H (W, REV)            g.13203G>A
(5) R175H (REV)               g.13203G>A
(8) E285K                     g.14522G>A
(8) R273C (REV)               g.14486C>T
(8) R282W                     g.14513C>T
(8) R306X (REV)               g.14585C>T
(5) K164X                     g.13169A>T
(5) R175H (REV)               g.13203G>A
(7) R248Q                     g.14070G>A
(8) R282W                     g.14513C>T
(6) Y205F (LF)                g.13374A>T
(8) R273C (REV)               g.14486C>T
(5) Q167 insA                 g.13178_13179insA
(8) R282W                     g.14513C>T
(7) S261R (LF)                g.14108A>C
(7) R248W                     g.14069C>T
(7) S240 delA (LF)            g.14045delA
(5) R175H (REV)               g.13203G>A
(5) R175H (REV)               g.13203G>A
(7) S261R (LF)                g.14108A>C
(6) Q192X                     g.13334C>T
(6) H214 insA (LF)            g.13401_13402insA
(7) G245S                     g.14060G>A
(5) R175H                     g.13203G>A
(6) H214 insA                 g.13401_13402insA
(8) R273H                     g.14487G>A
(8) R267G                     g.14468C>G
(8) R282W                     g.14513C>T
(7) R248Q                     g.14070G>A
(7) R248Q                     g.14070G>A
(6) R213R                     g.13399A>G
(6) P190L (LF)                g.13329C>T
(6) R213R (LF, S)             g.13399A>G
(7) R248W                     g.14069C>T
(7) E258D (LF)                g.14102A>T
(8) R273C (REV)               g.14486C>T
(6) E224E (LF, S)             g.13432G>A
(8) R273C (REV)               g.14486C>T

                              Harmonized        Automated     Manual
                              protocol          sequencing    sequencing
p53 mutants (75 samples)      66/66             41/66         59/66
% correct                     Set at standard   62%           89%

(5), (6), (7), (8) = exon 5, 6, 7, or 8.
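
The sensitivity figures quoted in the text are simple ratios of
mutations found to the 66 mutations identified by the harmonized
protocol, which serves as the standard. The short sketch below
recomputes them from the counts reported above; the function and
variable names are illustrative and not part of the original work.

```python
def sensitivity(detected: int, total: int) -> float:
    """Percentage of reference mutations that a method detected."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    return 100.0 * detected / total


# Counts as reported in the text; the harmonized protocol is the
# standard against which both sequencing approaches are scored.
TOTAL_MUTATIONS = 66
FOUND = {
    "harmonized protocol": 66,
    "sequencing, automated reading": 41,
    "sequencing, manual reading": 59,
}

if __name__ == "__main__":
    for method, found in FOUND.items():
        pct = sensitivity(found, TOTAL_MUTATIONS)
        print(f"{method}: {found}/{TOTAL_MUTATIONS} = {pct:.0f}%")
```

Rounded to whole percentages, the two sequencing rows reproduce the
62% and 89% values given in the text.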

Analysis of 138 Colorectal Tumors and Liver
Metastases Using the Harmonized Protocol

A summary of the colorectal tumor data can be found in Table 2.
We found that 44% of the samples were mutant for TP53. The
frequency of mutation for TP53 was in agreement with previous
findings for colorectal cancer [Iacopetta, 2003; Soussi et al.,
2000]. For TP53, 60 samples had point mutations, four samples
exhibited deletions, five samples contained polymorphisms, one
sample was double mutant (insertion of A in codon Q167 +
p.R282W ([g.13178_13179insA] + [g.14513C>T])), and one sample
was identified as a triple mutant (p.K164X + p.R282W + p.R306X
([g.13169A>T] + [g.14513C>T] + [g.14585C>T])).

As a quality control step before data release, the first pass
results from the present study were compared to the 1,516
mutations found in colon cancer that were stored in the
database. The distribution of the mutations along the TP53 gene
and the pattern of mutation events were quite similar (see
Supplementary Figure S2, available online at
http://www.interscience.wiley.com/jpages/1059-7794/suppmat),
reinforcing the accuracy of the harmonized protocol. In
addition, by performing detailed comparisons of the current
results with the TP53 database, we were able to identify and
quickly eliminate a spurious weak LDR signal corresponding to
p.R273S (g.14486C>A). This mutation had never been detected in
the 1,620 entries for colorectal cancer and only 11 times in the
database overall. When preliminary PCR/LDR results suggested the
low-level presence of this mutation in seven samples, the
finding was considered highly improbable; the results from the
EndoV tract were closely inspected and DNA sequencing in both
directions was pursued. Failure to confirm the borderline LDR
finding led us to carefully scrutinize the p.R273S (g.14486C>A)
reaction. It was revealed that human error led to the synthesis
of LDR oligonucleotides that did not correspond to the correct
signal. Since our synthetic template to test ligation reactions
was designed as reverse complements of the joined LDR
oligonucleotides, early tests of the chip did not identify this
reaction as problematic. As a result of this finding, we will in
the future design synthetic templates de novo, without referring
to the LDR product for the reaction. Thus, by relying on a
bioinformatics approach, spurious signals can be readily
distinguished from authentic mutations and processes can be
continuously refined and improved. In contrast, mutation
detection at the very same R273 codon by gene chip hybridization
was unable to distinguish a weak signal resulting from false
hybridization of wild-type sequence [Wen et al., 2000] from the
presence of low-level mutant allele.

A common polymorphism in TP53 is located at codon 72. Since LDR
had been shown to be sufficiently sensitive to detect the
presence of a variant sequence when diluted 1:100 in wild-type
sequences (see description of gel-based assay above),
determination of the status of both codon 72 alleles was
possible, even in the event of 17p loss of heterozygosity (LOH)
in the tumor cells, due to the 30 to 50% estimated stromal
contamination. In agreement with previous studies that showed
the allele frequencies in Caucasians to be approximately 70%
arginine allele and 30% proline allele [Beckman et al., 1994;
Harris et al., 1986], we found allele frequencies of 75 and 25%
for the R and P alleles, respectively.
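
The argument that both codon 72 alleles remain detectable after LOH
can be made concrete with a small calculation. The sketch below is an
illustration under stated assumptions, not the authors' analysis: it
assumes stromal cells are heterozygous and carry two copies of the
locus, tumor cells retain exactly one copy after LOH, and the 30 to
50% contamination refers to the fraction of cells that are stromal.
The function name and the 1:100 threshold constant are introduced
here for illustration.

```python
def lost_allele_fraction(stroma_fraction: float) -> float:
    """Fraction of all codon 72 copies that carry the allele lost in
    the tumor, under the assumptions stated above.

    Stromal cells contribute one copy of each allele; tumor cells
    contribute one copy of the retained allele only.
    """
    if not 0.0 <= stroma_fraction <= 1.0:
        raise ValueError("stroma_fraction must be between 0 and 1")
    lost_copies = stroma_fraction * 1.0
    total_copies = stroma_fraction * 2.0 + (1.0 - stroma_fraction) * 1.0
    return lost_copies / total_copies


# Sensitivity reported for the gel-based LDR assay (1 in 100).
DETECTION_LIMIT = 0.01

if __name__ == "__main__":
    for stroma in (0.30, 0.50):
        frac = lost_allele_fraction(stroma)
        status = "above" if frac > DETECTION_LIMIT else "below"
        print(f"stroma {stroma:.0%}: lost allele at {frac:.1%} "
              f"({status} the 1:100 limit)")
```

Under these assumptions the allele lost in the tumor still makes up
roughly a quarter to a third of the template, far above the 1:100
dilution at which LDR was shown to work.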

DISCUSSION 

We have designed a harmonized protocol that relies on 
two sensitive enzymatic reactions to detect the presence 
of mutations in undissected solid tumors, in which 
contaminating wild-type stroma can account for the 
majority of DNA template present in a sample. The 
universal microarray-based tract uses a thermostable 
ligase enzyme to detect predetermined mutations by 
discriminating between wild-type and mutant templates, 
resulting in the separation of the mutation detection and 
array hybridization. The mutation scanning tract detects
unknown mutations and relies on thermostable EndoV
and ligase to produce high sensitivity. The harmonized
protocol successfully detected all 66 mutations found in
the present study, many of which were missed by DNA
sequencing.

Although DNA sequencing using manual reading performed
significantly better than automated reading (89 and 62%
sensitivity, respectively), identification of mutated bases
ultimately relied on prior knowledge of mutation position, as
indicated by results from the harmonized protocol. Overall, the
fusion of PCR/LDR/Universal array and EndoV mutation scanning
proved to be a rapid means of identifying mutation. While it may
be argued that other combinations of mutation detection methods
might result in equal sensitivity (e.g., SSCP + sequencing),
these applications still fall short. Whereas electrophoretic
mobility assays can detect low-level mutations, these approaches
may miss a significant portion of mutations: SSCP misses 30% of
possible mutations [Bjursell et al., 2000; Hayashi, 1991; Korn
et al., 1993; Makino et al., 1992; Suzuki et al., 1990]; methods
such as CSGE, DGGE, CDGE, and DHPLC, which look for differential
electrophoretic migration between homo- and heteroduplexes, have
been shown to miss 11% of polymorphisms [Chen and Thilly, 1994;
Fodde and Losekoot, 1994; Ganguly, 2002; Ganguly et al., 1993;
Guldberg and Guttler, 1994; Khrapko et al., 1994; Kozlowski and
Krzyzosiak, 2001; Larsen et al., 2001; Mitchelson, 2001;
Ridanpaa and Husgafvel-Pursiainen, 1993; Rozycka et al., 2000].
DNA sequencing, on the other hand, has low sensitivity and will
miss low-level mutations. To optimize paired reactions based on
these approaches, microdissection would be required. The major
strength of the harmonized assay is that mutation detection may
be implemented in the absence of microdissection to enrich for
tumor DNA.

TABLE 2. Summary of Colorectal Tumor TP53 Mutation Analysis

Variable                               Outcome                      Frequency
Tumor Stage                            I                            15
                                       II                           22
                                       III                          41
                                       IV                           42
                                       Metastasis                   18
                                       Total                        138
TP53 Mutation/Polymorphism Summary     WT                           75
                                       Mutations                    61
                                       Polymorphisms                5
                                       Total                        138
TP53 Amino Acid 72 Polymorphism        P homozygote                 10
Status                                 Heterozygote                 55
                                       R homozygote                 73
                                       Total                        138
TP53 Nonsense & Missense Mutations     Missense                     51
                                       Nonsense/indel               10
                                       Total                        61
TP53 Structural Motifs with Mutation   DNA contact                  28
                                       L2 loop                      18
                                       Beta sandwich                4
                                       H2 helix                     6
                                       C terminus                   2
                                       H1 helix                     1
                                       L3 loop                      2
                                       Total                        61
TP53 Mutations and Polymorphisms       R175H g.13203G>A             11
                                       R248Q g.14070G>A             8
                                       R273C g.14486C>T             7
                                       R282W g.14513C>T             5
                                       R248W g.14069C>T             3
                                       R273H g.14487G>A             3
                                       R306X g.14585C>T             2
                                       S261R g.14108A>C             2
                                       Y234C g.14028A>G             2
                                       H214 g.13401_13402insA       2
                                       E224E g.13432G>A             2
                                       R213R g.13399A>G             2
                                       G154G g.13141C>A             2
                                       E171X g.13190G>T
                                       E258D g.14102A>T
                                       E285K g.14522G>A
                                       G245S g.14060G>A
                                       G154D g.13140G>A
                                       H179Y g.13214C>T
                                       H193R g.13338A>G
                                       R267G g.14468C>G
                                       P190L g.13329C>T
                                       Q192X g.13334C>T
                                       R213X g.13397C>T
                                       Q167 g.13178_13179insA
                                       K164X g.13169A>T
                                       S240 g.14045delA
                                       Y163C g.13167A>G
                                       Y205F g.13374A>T
                                       Total
One advantage of the harmonized protocol is that the two tracts
of this combined process use the same multiplex PCR reaction;
thus this approach does not consume excessive amounts of limited
tumor DNA sample. Also, the two tracts of the process advance in
parallel and there is no need for microdissection on either
tract; thus this approach is also less time consuming. A current
limitation to the wider application of the harmonized assay may
be the lack of comprehensive databases for other important tumor
suppressor genes and oncogenes. To program the
PCR/LDR/Universal array tract in the present study, we used a
bioinformatics selection process to guide us to the most
significant TP53 mutations in colon cancer. The legitimacy of
the programmed mutations is supported by a recent report [Kato
et al., 2003], in which 2,314 TP53 mutations were evaluated for
functional impact using a yeast-based assay. With the exception
of A161 mutations, TP53 mutations included on the universal
array showed a loss of activity, demonstrating the efficacy of a
bioinformatics selection process. If similar database resources
are lacking for other genes of interest, it will first be
necessary to build the databases that can clarify which
mutations should be targeted for analysis. The rapid advances in
sequencing and mutation scanning technology over the past few
years will undoubtedly assist in expanding web-based databases
of both germline and inherited mutations in cancer-associated
genes. Additional bioinformatics resources will likely be
launched if it can be shown that tumor profiling can be
successfully applied to the clinical decision process.

In previous analyses that used direct hybridization to
oligonucleotide gene arrays to detect TP53 mutations in frozen
tissue, tumors deficient in neoplastic cells required selective
microdissection. Although the intent of the gene hybridization
chip was to detect all TP53 mutations, it failed, detecting only
81% [Ahrendt et al., 1999], 81% [Wikman et al., 2000], and 92%
[Wen et al., 2000] of TP53 mutations. In all cases,
insertion/deletion mutations proved intractable to this
detection scheme, and significantly reduced the sensitivity
values. Further, gene hybridization arrays required statistical
considerations on background to improve specificity from 34 to
86%, but at the cost of reduced sensitivity from 92 down to 84%
[Wikman et al., 2000]. In contrast, the LDR primers were
predicted to accommodate roughly 70% of colon cancer mutations
(as estimated by prevalence in the TP53 database) and
PCR/LDR/Universal array succeeded in identifying 68% of TP53
mutations found in the undissected colorectal adenocarcinomas
analyzed (Table 1). By creating a harmonized protocol involving
both PCR/LDR/Universal array and EndoV mutation scanning, all
TP53 mutations in the targeted exons are detected, including
insertion/deletion mutations, thus achieving high sensitivity
with high specificity. This result demonstrates that the
PCR/LDR-bioinformatics approach to universal chip development
combined with EndoV mutation scanning presented here outperforms
approaches that attempt to target all possible mutations, such
as gene chip hybridization or automated sequencing. The added
value of our assay is that time-consuming microdissection is
eliminated.

In conclusion, rapid and accurate mutation analysis of tumors is
critical to resolve differences in prognosis and response to
therapy. Due to the comparatively advanced state of
understanding, the TP53 gene is a strategic starting point to
demonstrate proof of principle and to advance the translational
research to achieve this goal. Approaches that prove successful
for TP53 mutation detection (e.g., curation of comprehensive
mutation databases, bioinformatics-based approaches to assay
design, and development of rapid, sensitive assays) may also be
applied to characterizing the mutation status of other tumor
suppressor genes and oncogenes. In time, the ability to perform
molecular profiling of tumors may facilitate tailoring
individualized treatments for individual patients.
DNA methylation in CpG islands is associated with transcriptional silencing. Accurate determination of cytosine
methylation status in promoter CpG dinucleotides may provide diagnostic and prognostic value for human cancers.
We have developed a quantitative PCR/LDR/Universal Array assay that allows parallel evaluation of methylation
status of 75 CpG dinucleotides in the promoter regions of 15 tumor suppressor genes (CDKN2B, CDKN2A, CDKN2D,
CDKN1A, CDKN1B, TP53, BRCA1, TIMP3, APC, RASSF1, CDH1, MGMT, DAPK1, GSTP1, and RARB). When compared with an
independent pyrosequencing method at a single promoter, the two approaches gave good correlation. In a study
using 15 promoter regions and seven blinded tumor cell lines, our technology was capable of distinguishing
methylation profiles that identified cancer cell lines derived from the same origins. Preliminary studies using 96
colorectal tumor samples and 73 matched normal tissues indicated CpG methylation is a gene-specific and
nonrandom event in colon cancer. This new approach is suitable for clinical applications where sample quantity and
purity can be limiting factors.

[Supplemental material is available online at www.genome.org.]



Aberrant methylation of CpG dinucleotides in the 5' regulatory
region of genes often results in transcriptional inactivation
and has been implicated in aging, heart and neurodegenerative
diseases, as well as in the pathogenesis of various types of
cancers (Feinberg and Vogelstein 1983; Gardiner-Garden and
Frommer 1987; Post et al. 1999; Baylin and Herman 2000;
Robertson and Wolffe 2000; Warnecke and Bestor 2000; Feinberg
2001; Jones and Baylin 2002; Cui et al. 2003). There is a
growing interest in understanding the correlation between
aberrant DNA methylation and tumorigenesis (Huang et al. 1999;
Toyota et al. 1999; Costello et al. 2000; Yamashita et al. 2003)
in order to facilitate disease marker discovery, diagnostic tool
development, and the study of chemotherapeutic response (Laird
2001).

Current methods for detecting 5-methylcytosine can be divided
into three major approaches (Laird 2003): (1) profiling
methylation globally, (2) identifying methylation patterns at a
cluster of CpG sites, and (3) determining methylation levels at
individual CpG dinucleotides. Each category offers a different
perspective for studying DNA methylation. In general, global
screening methods rely on methylation-sensitive restriction
enzyme digestion and provide opportunities for new epigenetic
marker discovery (Huang et al. 1999; Toyota et al. 1999;
Costello et al. 2000; Yamashita et al. 2003). Methylation-specific
PCR (MSP) (Herman et al. 1996) and variations of this procedure
(Cottrell and Laird 2003; Zeschnigk et al. 2001) were introduced
to study the methylation pattern of a few closely neighboring CpG

Corresponding author.

E mail l>iir«i)r»m*il.(orn«H.<du,' lai (213) 74«-«l04. 

Article published online ahead of print. Article and publication date are at
http://www.genome.org/cgi/doi/10.1101/gr.4181406.



sites. Since DNA methylation is believed to be an early event
during carcinogenesis (Laird 1997), the high sensitivity of MSP
is suitable for use as an early detection tool on known
epigenetic markers (Hoque et al. 2004). For quantitative
assessment of individual CpG dinucleotide methylation status,
the commonly used methods combine bisulfite treatment with
sequencing (e.g., bisulfite sequencing and pyrosequencing)
(Frommer et al. 1992; Uhlmann et al. 2002; Dupont et al. 2004;
Yang et al. 2004), primer extension (e.g., SNuPE) (Gonzalgo and
Jones 1997), restriction enzyme digestion (e.g., COBRA) (Xiong
and Laird 1997), or real-time PCR (Zeschnigk et al. 2004). These
assays provide quantitative profiling or detailed analysis of
5-methylcytosine distribution. The quantitative information
generated is currently being used for correlating
disease-specific methylation markers to clinical outcomes and
facilitating the discovery of anti-tumorigenic drugs (Cheng et
al. 2004; Issa 2004). However, the current methods analyze CpG
methylation status one gene at a time and have limited
multiplexing capability. Bisulfite sequencing provides the most
comprehensive data on methylation status at every CpG but
requires subcloning and sequence analysis of 10-20 individual
clones. Higher throughput has been achieved by combining
bisulfite-PCR with microarray technology utilizing
oligonucleotide probes designed to form a perfect match with
either methylated or unmethylated alleles within the target
sequences (Adorjan et al. 2002; Balog et al. 2002; Gitan et al.
2002). This allows parallel evaluation of CpG methylation status
at numerous CpG sites across multiple genomic regions of
interest. However, bisulfite treatment renders genomic DNA into
AT-rich sequences, which exacerbates nonspecific and mismatch
hybridizations due to differences in annealing temperatures
between different probe sequences. In addition, probes
containing two or more CpG dinucleotides may lack the
sensitivity to distinguish partially methylated sequences from
those that are fully methylated in heterogeneous clinical
samples.

Genome Research
www.genome.org

Cheng et al.

We seek to develop a robust assay for clinical application that
provides quantitative methylation levels for multiple CpG
dinucleotides in a given genomic region, as well as allowing
specific evaluation of many genes in parallel. Such an assay can
provide a representational CpG methylation profile of candidate
genomic regions, and this profile information may be useful for
disease stratification or as predictors of therapeutic response.
This work presents a new method that aims to substantially
improve quantitative microarray-based methylation detection to
meet these needs. As illustrated in Figure 1, combining PCR,
ligase detection reaction (LDR), and Universal Array (where
zip-code sequences appended to LDR primers guide products to
zip-code complements on an array) (Gerry et al. 1999; Favis et
al. 2000) allows multiplexing and provides high specificity and
accuracy. A detailed, quantitative methylation profile of
essentially any set of CpG dinucleotides can be determined by
using this assay. Fifteen tumor suppressor genes commonly linked
to transcriptional silencing in various human cancers were
chosen and the methylation status of their promoter regions
evaluated (http://www.mdanderson.org/departments/methylation).
Up to six CpG dinucleotides per promoter region were
investigated and a total of 75 CpG dinucleotides queried per
sample.

Bisulfite-PCR/LDR/Universal Array
Figure 1. Schematic diagram of the assay. Two hypothetical CpG dinucleotide sites 1 and 2 are
designated as methylated and unmethylated, respectively. Sodium bisulfite converts unmethylated,
but not methylated, cytosines into uracils. This conversion renders the genomic DNAs into two
asymmetrical, noncomplementary strands, and only one designated bisulfite-modified strand
(highlighted in orange) is amplified and analyzed. In the initial amplification, PCR primers are
designed with a gene-specific 3' portion and an upstream universal sequence (highlighted in black).
This universal sequence is used as a PCR primer in the subsequent PCR to simultaneously amplify all
the primary amplicons (for ease of illustration, only one amplicon is shown). LDR is performed in a
multiplex fashion with three primers (two discriminating and one common primer) interrogating each
of the selected CpG sites. The discriminating primers contain a 5' fluorescent label and a 3'
discriminating nucleotide to determine either methylated (with 5' Cy3 and 3' C) or unmethylated
(with 5' Cy5 and 3' T) cytosines. The common primers bear a 5' phosphate and a 3' unique zip-code
complement sequence (e.g., czip1 and czip2). Ligation occurs only if the nucleotides at the ligation
junction are perfectly base-paired with a complementary template, and the ligation products are
captured onto a Universal microarray with prespotted zip-codes (addresses). For example, address
zip1 identifies methylated cytosine in methylation site 1, and address zip2 identifies unmethylated
cytosine in methylation site 2. Three Universal microarray addresses are assigned for each promoter
region, and each address is double-spotted to ensure the quality of array fabrication and
oligonucleotide hybridization efficiency.


Results 

Determining assay specificity and quantitative accuracy of
bisulfite-PCR/LDR/Universal Array

The general design of the assay is illustrated in Figure 1.
Genomic DNAs were treated with sodium bisulfite to convert
unmethylated, but not methylated, cytosines into uracils. We
have modified the standard bisulfite protocol to ensure a
thorough deamination of unmethylated cytosines and increase DNA
recovery (Boyd and Zon 2004). Gene-specific PCR primers bearing
5' universal tails were designed to flank each promoter region.
A second, universal PCR step allows approximately equal fragment
amplification of all sequences amplified in the primary PCR.
Since PCR is not the final readout in this assay, primer design
is flexible and less constrained by sequence context and is
independent of CpG dinucleotide methylation status. Three LDR
primers were used to determine the methylation status of each
CpG dinucleotide. LDR primers were designed to tolerate
mismatched base pairs at internal CpG sites and allow
hybridization to fully and partially methylated sequences, as
well as unmethylated sequences. A high-fidelity Tth ligase (Luo
et al. 1996) only ligates the upstream (discriminating) and
downstream (common) primers when the 3' discriminating
nucleotide at the junction is complementary to the DNA template.
This feature allows accurate, quantitative detection of targeted
CpG dinucleotides regardless of the presence of internal CpG
dinucleotides within the primer sequences. For example (Fig. 1),
at the methylated CpG site 1, only Cy3-C-labeled ligation
products are formed, whereas only Cy5-T-labeled ligation
products are formed at the unmethylated CpG site 2. Unique
complementary zip-code sequences on the 3' ends of common
primers guide LDR products to their corresponding zip-codes on a
Universal Array (Gerry et al. 1999; Favis et al. 2000). The
zip-codes are unique sequences designed with a constant Tm and
have no homology to either the target sequence or to other
sequences in the genome. This design eliminates false signals
due to nonspecific binding and mismatch hybridizations.

The assay was validated on genomic DNAs extracted from two
commonly used colorectal (HCT15) and prostate cancer (LNCaP)
cell lines. Promoter regions chosen in this study have included
15 tumor suppressors (CDKN2B [formerly known as p15INK4b],
CDKN2A [formerly known as p16INK4a], CDKN2D [formerly known as
p14ARF], CDKN1A [formerly known as p21], CDKN1B [formerly known
as p27Kip1], TP53 [formerly known as p53], BRCA1, TIMP3, APC,
RASSF1, CDH1 [formerly known as E-CAD], MGMT, DAPK1 [formerly
known as DAPK], GSTP1, and RARB [formerly known as RARβ]) and
one hemi-methylated imprinted gene (SNRPN, as an internal
control). The methylation
profiles of the candidate promoter regions were determined by
bisulfite sequencing, which revealed CDKN2B, CDKN2A,
CDKN2D, CDKN1A, CDKN1B, TP53, BRCA1, DAPK1, CDH1,
MGMT, and TIMP3 were unmethylated in LNCaP, while
CDKN2A, CDKN2D, MGMT, RARB, and RASSF1 were methylated
in HCT15 among 15 tumor suppressor genes. The initial LDR/
Universal Array assay was designed to evaluate methylation
status of three CpG sites per promoter region. LDR primers
detecting methylated and unmethylated cytosines were validated by
using in vitro methylated (SssI methylase) and untreated normal
human lymphocyte genomic DNAs, respectively (data not
shown). Following bisulfite treatment, genomic DNA of each cell
line sample was multiplex PCR amplified, and the pooled PCR
products were subjected to LDR/Universal Array analysis (Fig.
2A). We tested the assay specificity by using LNCaP DNA and
subsets of LDR primers that detect only unmethylated cytosines
(Fig. 2B; data not shown). The capture of Cy5 fluorescence signals
only at the designated zip-code addresses for each LDR primer set
indicated that LDR/Universal Array did not generate nonspecific
ligation products and that mismatch hybridization was absent.
To further demonstrate the assay's accuracy, LDR primers that
detected only methylated cytosines were used to investigate a
total of 48 CpG sites simultaneously for each cell line (Fig. 2C).
Our data are consistent with the bisulfite sequencing results,
indicating an accurate methylation profile was obtained. Different
levels of fluorescence intensity were observed at several zip-code
addresses. These variations suggested that the targeted CpG
dinucleotides may have different methylation levels within the
same promoter regions (e.g., RASSF1 and TIMP3).

To determine if the assay could be quantitative, genomic
DNA from HCT15 (carrying methylated CpG dinucleotides) was
mixed with normal human lymphocytes (carrying unmethylated
alleles), such that the test samples contained 0%, 20%, 40%,
60%, 80%, and 100% of HCT15 DNA. These mixtures were
subjected to bisulfite PCR/LDR/Universal Array analysis (Fig. 3A;
data not shown). The average fluorescence intensity representing
either methylated (Cy3) or unmethylated (Cy5) alleles from each
double-spotted zip-code address was used to calculate the
methylation ratio of Cy3/(Cy3 + Cy5). Each experiment was repeated
at least twice and produced consistent results. Most of the CpG
dinucleotides we evaluated have R² values between 0.98 and
0.89. Those CpG sites that gave lower R² values are likely due to
inefficient competition between LDR primers targeted toward

Figure 2. Representative bisulfite-PCR/LDR/Universal Array analysis of
16 promoter regions of cell lines HCT15 and LNCaP. (A) For the ease of
demonstration, either five or six promoter regions were amplified in one
PCR, and a total of 16 genes were simultaneously analyzed. The gene
names and the corresponding PCR fragments are as follows: (lane 1)
CDKN2B (3U bp), CDKN2A (363 bp), CDKN1A (391 bp), CDKN1B (436
bp), SNRPN (M2 bp), and BRCA1 (459 bp); (lane 2) CDKN2D (346 bp),
TIMP3 (404 bp), APC (433 bp), RASSF1 (474 bp), and CDH1 (513 bp); and
(lane 3) MGMT (562 bp), TP53 (418 bp), DAPK1 (434 bp), GSTP1 (50?
bp), and RARB (Si2 bp). (B) LDR/Universal Array analysis of the
unmethylated cytosines in LNCaP amplicons. All PCR products were pooled as
LDR templates, but only selected LDR primers were used in each reaction
(LDR set 1: SNRPN, CDKN2B, CDKN1A; LDR set 2: CDKN2A, TP53, BRCA1).
The subset of promoter regions that were interrogated in each LDR are
depicted in the diagram (green circles) under each array image. The
Cy5-labeled LDR products (false color green, designed for unmethylated
cytosines) were captured on Universal Arrays. (C) All PCR products of each
sample were pooled and subjected to LDR/Universal Array assay. Only
Cy3-labeled LDR primers (false color red) were used in this assay to detect
methylated cytosines. The diagram under each array image depicts the
correlated zip-codes (circles) that were assigned to represent the CpG
methylation status in each of the 16 promoter regions. Each zip-code was
double-spotted on the array to ensure fabrication quality. Red and empty
circles represent methylated and unmethylated CpG sites, respectively.
Pink circles represent those CpG dinucleotides that have lower level of
methylation. The PCR and LDR primer sequences and their
concentrations used in these experiments were listed in the Supplemental
Tables 1, 2, and 3.

Figure 3. The quantification curves of the assay. Genomic DNAs of
HCT15 and normal human lymphocytes were mixed in 0%, 20%, 40%,
60%, 80%, and 100% ratios and subjected to bisulfite-PCR/LDR/
Universal Array analysis. (A) Representative array images are shown
scanned in both Cy3 and Cy5 channels. False color red (Cy3) and green
(Cy5) represent the methylated and unmethylated alleles of CpG
dinucleotides, respectively. Color composites of the two channels reflect
the methylation levels. Each zip-code was double-spotted on the array to
ensure fabrication quality. MGMT and TIMP3 were used as examples to
show the assay linearity measured at individual CpG dinucleotides. The
plotted value at y-axis represents the fluorescence intensity Cy3/
(Cy3 + Cy5) ratio. The value at x-axis represents the percentage of HCT15
mixed with normal human lymphocyte genomic DNAs. The R² and P-
values of each linear regression line were calculated, although the lines
were omitted in the plots for visual clarity. Nearly no methylation was
observed at two of the CpG sites of TIMP3, resulting in poor statistical
correlation (circles R' » 0 81, 0 Oi; crosses R' = 0 OB, ^ = 0 01). The
experiments were repeated three times with different sample
preparations and array hybridizations. The PCR and LDR primer sequences and
their concentrations used in these experiments are listed in Supplemental
Tables 3 and 4. (B) Comparison of the MGMT methylation level from 15
colorectal carcinomas using pyrosequencing technology and bisulfite/
PCR/LDR/Universal Array. Three CpG sites were evaluated per DNA
sample. The plotted value at y-axis represents the percentage of
methylated cytosines in the tumor samples as obtained from pyrosequencing.
The value at x-axis represents the ratio of fluorescence intensities Cy3/
(Cy3 + Cy5). The mixed genomic DNAs of HCT15 and normal human
lymphocytes shown in Figure 3A were included as controls.



unmethylated and methylated alleles and can be resolved by
redesigning the LDR primers to have a higher melting temperature.
Our analysis confirmed the different percentage of methylation
at each CpG dinucleotide and suggested that methylation level is
not 100% at each CpG site in tumor cell line DNA. For example,
for HCT15, a medium level of methylation was observed at the
first CpG dinucleotide and <10% methylation at the other two
CpG dinucleotides in TIMP3. By comparing the ratio of
(methylated):(methylated + unmethylated) DNA in different cell lines,
we could extrapolate the CpG methylation level at a given
position. Alternatively, SssI could be used to methylate DNA in vitro
to completion to generate standard curves for calibration (data
not shown). We tested assay sensitivity by mixing tumor cell line
DNA with normal human lymphocyte DNAs. A preliminary
study suggested that the unbiased PCR primer design was
sufficient to detect the presence of methylated alleles, even when
diluted down to 1% in unmethylated alleles (Supplemental Fig.
1). Nevertheless, in the colorectal cancer study, we currently use
10%-15% as a cut-off for our scoring criteria. Overall, our data
demonstrate that bisulfite-PCR/LDR/Universal Array approach is
a quantitative and sensitive method for the measurement of DNA
methylation.
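
The quantification described above can be sketched in a few lines of
Python. This is an illustration only, not the authors' analysis code: the
intensity values are invented placeholders, and the helper names
(methylation_ratio, r_squared) are hypothetical. It computes the
Cy3/(Cy3 + Cy5) ratio for each mixture and the R² of an ordinary
least-squares line against the percentage of HCT15 DNA.

```python
def methylation_ratio(cy3, cy5):
    """Return Cy3 / (Cy3 + Cy5); Cy3 = methylated, Cy5 = unmethylated signal."""
    total = cy3 + cy5
    if total == 0:
        raise ValueError("no signal in either channel")
    return cy3 / total


def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined for constant data")
    return (sxy * sxy) / (sxx * syy)


# Percentage of HCT15 DNA in each mixture (values taken from the text).
percent_hct15 = [0, 20, 40, 60, 80, 100]

# Hypothetical (Cy3, Cy5) mean intensities per mixture; NOT measured data.
intensities = [(50, 950), (240, 760), (410, 590),
               (600, 400), (790, 210), (960, 40)]

ratios = [methylation_ratio(cy3, cy5) for cy3, cy5 in intensities]
fit_quality = r_squared(percent_hct15, ratios)
print([round(r, 2) for r in ratios], round(fit_quality, 3))
```

A CpG site whose ratios track the mixing percentage closely gives an R²
near 1; a site with almost no methylation signal gives a flat series and
a poor fit.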

Comparison of the method with a pyrosequencing assay
using clinical tumor samples

To further evaluate the quantitative accuracy and clinical utility,
we analyzed MGMT methylation level at three CpG dinucleotides
per sample on a total of 15 colorectal tumors, using both our
assay as well as by pyrosequencing. Genomic DNAs of these
tumors were bisulfite treated, amplified by using multiplex PCR,
and analyzed by LDR/Universal Array. Alternatively, MGMT was
uniplex amplified from the same tumor DNAs for pyrosequencing.
Three sequencing primers were designed to investigate the
methylation level of the same CpG dinucleotides that were
studied by LDR/Universal Array. The comparison revealed a high
correlation between these two methods (Fig. 3B). The few samples
where results varied may be due to variations in array fabrication,
differences in efficiency of multiplex versus uniplex PCR
amplification, and, most likely, the pyrosequencing primer design.
Two of the sequencing primers contained two and three CpG
sites, respectively, and generated a biphasic plot while
establishing standard curves. This may have led to bias in the
quantification seen by pyrosequencing. Nevertheless, this highly
significant correlation suggests bisulfite/PCR/LDR/Universal Array is an
accurate method for quantitative analysis of heterogeneous
clinical samples.

Application of the method to cancer cell lines
and heterogeneous clinical samples

We performed a blinded study to determine methylation profiles
of several cancer cell lines (five colorectal, one breast, and one
prostate) with this approach (Fig. 4). Three CpG sites per
promoter region were evaluated. Methylation was not observed in
the promoter regions of CDKN2B, CDKN1A, CDKN1B, TP53, and
BRCA1 among all the tested cell line DNAs. These results
suggested that promoter methylation is not a random event. All cell
lines under the blinded study have very distinct methylation
profiles among the 15 tumor suppressor genes except for two
pairs of cell lines, HCT15/DLD-1 and HT29/WiDr, which share
almost identical promoter methylation patterns. Each pair of cell
lines HCT15/DLD-1 and HT29/WiDr was established from the
same colon carcinomas (additional cell line information is
described in American Type Culture Collection [ATCC] Web site
http://www.atcc.org) (Chen et al. 1995). This observation
suggested that colorectal cancer cell lines derived from the same
tumor have essentially identical methylation profiles that are

Figure 4. Selected examples of DNA methylation profiles of cancer cell lines and clinical samples. Five colorectal (HCT15, DLD-1, HT29, WiDr, and
SW620), one breast (MCF7), and one prostate (LNCaP) cancer cell lines, 10 primary colorectal cancer (T series) and 10 adjacent normal tissues (N series)
were analyzed. Six CpG sites per promoter region were analyzed in each sample except CDKN2B, CDKN1A, CDKN1B, TP53, and BRCA1. Standard curves
were not established for the genes mentioned above since hypermethylation is not reported in the literature or observed in our clinical sample study.
The standard curves should be established when applying this assay to other tumor types such as breast cancer. Around 10%-15% established CpG
methylation standard curves gave lower R² values (between low 80s and 70s), including CDKN2A (CpG.3 and CpG «), CDKN2D (CpG.1), GSTP1
(CpG.3), DAPK1 (CpG.3), RASSF1 (CpG.4), and TIMP3 (CpG-?). Only four CpG sites of RARB were shown due to defects on zip-code addresses during
array fabrication. The color scale represents the percentage of methylation levels determined from the standard curves at each CpG dinucleotide. Notice
that for HCT15/DLD-1 and HT29/WiDr, each pair is derived from a same tissue origin, reflected in their identical methylation patterns. The
PCR and LDR primer sequences and their concentrations used in these experiments are listed in the Supplemental Tables 3, 4, and 5.



distinct from cell lines derived from a different patient. To ensure
the accuracy of scoring a candidate gene promoter as
hypermethylated, the methylation status of three additional CpG sites per
promoter was examined (examples are shown in Supplemental
Fig. 2) for the 10 genes that were found to be methylated in the
cell line study (CDKN2A, CDKN2D, TIMP3, APC, RASSF1, CDH1,
MGMT, DAPK1, GSTP1, and RARB). Within each promoter, the
six CpG sites examined were dispersed evenly throughout the
amplified genomic sequence. Some of the CpG sites investigated
were within 50 bases and resulted in the design of overlapping
LDR primers. Nevertheless, we have found that LDR efficiency
was not altered by the overlapping primer design (data not
shown). Thus, the cytosines of all 75 CpG sites for a given sample
were interrogated independently at the same time.
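
One way to arrive at the figure of 75 CpG sites is sketched below. The
split is inferred from the text rather than given as a formula in the
source: the 10 genes found methylated in the cell line study are taken
at six sites each, the remaining five tumor suppressor promoters at
three sites each, and the SNRPN control is assumed not to be counted.

```python
# Genes scored at six CpG sites (the 10 found methylated in the cell line study).
six_site_genes = ["CDKN2A", "CDKN2D", "TIMP3", "APC", "RASSF1",
                  "CDH1", "MGMT", "DAPK1", "GSTP1", "RARB"]

# Remaining tumor suppressor promoters, assumed to stay at three sites each.
three_site_genes = ["CDKN2B", "CDKN1A", "CDKN1B", "TP53", "BRCA1"]

total_cpg_sites = 6 * len(six_site_genes) + 3 * len(three_site_genes)
print(total_cpg_sites)
```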

In a pilot study, we have profiled methylation status of 96
colorectal tumor samples and 73 matched normal tissues
(selected data are shown in Fig. 4; Supplemental Fig. 2). Our
preliminary analysis shows a rich variety of methylation profiles. Seven
promoter regions (CDKN2A, CDKN2D, APC, MGMT, RARB,
RASSF1, and TIMP3) showed statistically significant increased
methylation in tumor compared to normal tissues. Promoter
regions of CDKN1A, CDKN1B, TP53, and BRCA1 revealed little or
no methylation. Thus, methylation profiles of clinical samples
were similar to those observed in the colorectal cancer cell lines.
One of the CpG dinucleotides of CDKN2B was frequently
methylated in the tumor samples. The biological significance of this
methylation remains to be investigated by examining additional
CpG dinucleotides to determine CDKN2B promoter
hypermethylation status. Nevertheless, these results reaffirm that
methylation in colorectal cancer is gene specific and nonrandom.
Additional tumor samples will be examined to provide sufficient
power to determine the correlation between promoter
methylation and the tissue pathological-clinical information.

Discussion 

We have presented an accurate and quantitative assay that
provides a representational CpG methylation profile from colorectal
cancer samples, where stromal cell infiltration is often seen. This
assay simultaneously determines the DNA methylation status of
multiple gene promoters, querying a total of 75 CpG
dinucleotide sites per sample. Genomic DNAs isolated from seven cancer
cell lines and 169 colon tumor samples were tested. Cell lines
derived from the same tumor have essentially identical
methylation profiles. The percentage of CpG dinucleotide methylation
of MGMT promoter in clinical samples was compared by our
assay with those derived from pyrosequencing and resulted in a
high correlation.

The candidate promoter regions, involving genes in cell
cycle regulation, DNA repair, and tumor metastasis and invasion,
were previously reported in the literature as associated with
abnormal gene silencing in tumors or cancer cell lines. Although
the percentage of methylated promoters in a cohort may vary
due to the sample source and the assays used in determining
DNA methylation status, our preliminary analysis of the
colorectal tumor methylation profile gave consistent results as those
published previously (Esteller et al. 2001). For example, in
agreement with our own findings, it has been shown that there is
essentially no hypermethylation in CDKN2B, TP53, BRCA1, and
GSTP1 promoter regions, while CDKN2D, CDKN2A, MGMT, and
APC were methylated to a higher level among all the samples
tested (Esteller et al. 2001).

Our assay allows virtually any CpG site in the promoter and
first intron regions for more than a dozen genes to be analyzed
simultaneously. There is increasing evidence that genes such as
MLH1 and RASSF1A exhibit an increasing gradient of
methylation from the promoter proximal region to the first exon (Deng
et al. 1999; Yan et al. 2003). To avoid the bias of scoring a hyper-
(or hypo-) methylated promoter and linking it to its disease state,
multiple CpG sites across a larger window of the genomic region
should be investigated in each assay. Studies done with MSP- and
restriction enzyme-based methods only reveal the methylation
pattern of small sequence regions; additional sequence contexts
may be needed to sufficiently determine the promoter
methylation status. Our approach interrogated multiple CpG
dinucleotides that were evenly distributed over 300-500 bases. For
paraffin-embedded tissue, two or more adjacent shorter PCR
amplicons should be designed to overcome the poor amplification
typically observed in these types of samples (Supplemental Fig. 3).
Moreover, the LDR efficiency was not affected, even when some of
the LDR primers were designed with overlapping sequences. This
technique provides a detailed mapping of the methylation
profile in each promoter CpG island locus that may correlate with
transcriptional silencing during disease progression.

The bisulfite-PCR/LDR/Universal Array approach provides
several advantages over existing methods for the analysis of DNA
methylation patterns. First, there are two levels of specificity
facilitated by gene specific primers, initially during PCR and
subsequently during LDR. Given the numerous duplications
annotated and suspected in the human genome, this approach
enhances the ability to reliably target the CpG islands in the locus
of interest. Second, LDR allows accurate identification of low
abundance nucleotide alterations with a remarkable accuracy
due to the high fidelity of Tth ligase (Luo et al. 1996). This unique
feature eliminates the concern of mismatch hybridization due to
partial methylation of internal CpGs of the LDR primer
sequences and allows LDR primers to be placed at essentially any
CpG dinucleotide of interest. Third, the unique zip-code
sequences were designed with similar Tm across the platform and
have no sequence homology in the human genome. This feature
allows only ligated LDR products to be captured, thus avoiding
background signals. As the Universal Array can be easily
expanded (Gerry et al. 1999), additional CpG sites or genes can be
evaluated in a single assay. Fourth, this assay has the potential to
detect low abundance methylated alleles. By redesigning the
multiplex gene-specific PCR primers to be methyl specific, a
detection of at least 0.1% was achieved, albeit with reduced
quantitative dynamic range (Supplemental Fig. 1). An inherent
problem with many DNA amplification techniques is that greater
detection sensitivity comes at the cost of increased false positives.
Although methylation assays relying solely on PCR as a readout
tool offer sensitive detection, they are prone to false positives
resulting from the AT richness or incomplete deamination of
bisulfite-treated DNAs. Consequently, such assays would be
limited in their multiplex capability. Our assay confirmed promoter
methylation status via six CpG sites within a PCR fragment; this
approach offers high specificity and accuracy while avoiding
false positives. Moreover, each module of the PCR/LDR/Universal
microarray approach is ideal for multiplexing and can be
automated by using a liquid handling system to increase throughput.
A recent publication has reported the detection of dozens to
hundreds of possible mutations by using multiplex PCR/LDR in a
single-tube format (Favis et al. 2004). The capture of multiplex
LDR products onto an array format provides an efficient
"modular" readout and substantially increases assay throughput.
Universal microarray experiments in our laboratory are now
performed in an array-of-arrays format, where 64 array
hybridizations are carried out simultaneously (data not shown). This
array-of-arrays approach drastically reduces the cost of array
fabrication, minimizes the variation during hybridization, and
increases throughput.

In summary, we present a robust and accurate method that
determines cytosine methylation at any selected set of CpG
dinucleotides in the genome. Importantly, this new method allows
the evaluation of methylation level at individual CpG sites.
Quantitative values for this parameter may facilitate
stratification of tumors, since based on the degree of methylation, it may
be possible to estimate disease progression. In addition, the
ability to quantify methylation at specific sites in advanced tumors
may provide information on tumor heterogeneity. These data
will enable clinical decisions related to individualized treatment
strategies. Our goal is to expand this prototype assay into a
focused array platform that investigates 30-50 frequently
methylated tumor suppressor promoters observed in several different
human carcinomas. Around 10-15 individual CpG dinucleotides
evenly distributed over a larger window of each promoter region
will be interrogated. We anticipate that such a focused platform
will facilitate the development of DNA-based molecular markers
for disease diagnosis and prognosis and will be suitable for
routine clinical use.



Methods 

Cell line culture, tumor samples, and DNA extraction 
Normal human lymphocyte genomic DNA was purchased from
Roche. Colorectal, breast, and prostate cancer cell lines were
obtained from American Type Culture Collection and cultured
under the ATCC-recommended media conditions. Fresh frozen
primary colorectal adenocarcinomas were obtained from Memorial
Sloan Kettering Cancer Center under Institutional Review Board
(IRB)-approved protocols. Genomic DNAs were extracted by
using the DNeasy Tissue Kit (Qiagen) according to the
manufacturer's guidelines.

Sodium bisulfite treatment of genomic DNAs
Typically, 1 µg genomic DNA was denatured in 40 µL of 0.2 N
NaOH by incubating for 10 min at 37°C before addition of 30 µL
of freshly prepared 10 mM hydroquinone and 520 µL of 3 M
sodium bisulfite. The reaction was incubated for 20 min at 50°C,
15 sec at 85°C for 48 cycles (16 h). The DNA clean-up procedure
was as follows: (1) the total reaction volume (~600 µL) was
transferred to a Microcon NCO30 filter (Millipore) and centrifuged at
13,000g for 16 min; (2) 500 µL deionized H2O were added to the
upper chamber, centrifuged at 13,000g for 7 min, the filtrate
discarded, and the wash repeated twice; (3) 500 µL of 0.3 M NaOH
were added to the upper chamber, incubated for 5 min at room
temperature and then centrifuged at 13,000g for 5 min; (4) 500
µL deionized H2O were added to the upper chamber, centrifuged
at 13,000g for 8 min, the filtrate discarded, and the wash
repeated; and (5) the filter was inverted to collect the bisulfite-
converted DNA. An appropriate volume of water (if needed) was
used to rinse the upper chamber to recover DNA in a final volume
of 20 µL.

6 Genome Research
www.genome.org

Multiplex detection of CpG methylation in tumors
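
As a quick arithmetic check on the bisulfite incubation program and
reaction volume quoted in the protocol above, a short sketch follows.
The variable names are illustrative, the temperatures are as read from
the text, and ramp times between steps are ignored.

```python
# Bisulfite incubation: each cycle is 20 min at 50 C plus 15 sec at 85 C.
cycles = 48
seconds_per_cycle = 20 * 60 + 15
total_hours = cycles * seconds_per_cycle / 3600  # ramp times not included

# Reaction volume in microliters: denatured DNA + hydroquinone + bisulfite.
total_volume_ul = 40 + 30 + 520

print(f"{total_hours:.1f} h, {total_volume_ul} uL")
```

Both figures agree with the rounded values given in the protocol (16 h
and roughly 600 µL).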

Multiplex PCR amplification

The multiplex PCR consists of two stages. PCR stage I (12.5 µL)
contained 1.5 µL bisulfite-modified DNA, 400 µM of each dNTP,
1x AmpliTaq Gold PCR buffer, 4 mM MgCl2, and 1.25 U
AmpliTaq Gold polymerase (Applied Biosystems). Mineral oil was
added prior to thermal cycling. The PCR stage I conditions were
as follows: 10 min at 95°C; 15 cycles of 30 sec at 94°C, 1 min at
60°C, and 1 min at 72°C; followed by a final extension step of 5
min at 72°C. PCR stage II (12.5 µL) contained 400 µM of each
dNTP, 1x AmpliTaq Gold PCR buffer, 4 mM MgCl2, 12.5 pmol
universal primer (UniB2, see Supplemental Table 1), and 1.25 U
AmpliTaq Gold polymerase. The 12.5 µL reaction mixture was
added through the mineral oil to the completed stage I PCR. The
PCR stage II conditions were as follows: 10 min at 95°C; 30 cycles
of 30 sec at 94°C, 1 min at 55°C, 1 min at 72°C; followed by a
final extension step of 5 min at 72°C. Taq DNA polymerase was
inactivated by adding 1.25 µL Proteinase K (20 mg/mL, Qiagen)
to the completed stage II PCR, incubating for 10 min at 70°C and
15 min at 90°C. Before pooling the PCR products for LDR assay,
the presence of amplicons was confirmed by electrophoresis on a
yX) agarose gel 

LDR, Universal Array hybridization, and data analyses
A typical LDR (20 µL) contained 20 mM Tris-HCl (pH 7.6), 10
mM MgCl2, 100 mM KCl, 10 mM DTT, 1 mM NAD, 25 fmol
wild-type Tth ligase (Zirvi et al. 1999), 500 fmol of each LDR
primer, and 5-10 ng of each PCR amplicon. The LDR conditions
were as follows: 3 min at 95°C; 25 cycles of 30 sec at 95°C and 4
min at 60°C. The LDR reaction was diluted with an equal volume
of 2x hybridization buffer (600 mM MES at pH 6.0, 20 mM
MgCl2, 0.2% SDS), denatured for 3 min at 95°C and plunged on
ice. The Universal Arrays were pre-equilibrated with 1x
hybridization buffer at room temperature for at least 15 min. Coverwells
(Grace Bio-Labs) were attached to arrays and filled with 40 µL
denatured LDR reactions. The assembled arrays were incubated
in a rotating hybridization oven for 60 min at 65°C. After
hybridization, the arrays were washed in 300 mM Bicine (pH 8.0),
0.1% SDS for 10 min at 50°C. An updated version with 384
addresses will accommodate all the LDR products. Each array was
scanned by using a Perkin Elmer ProScanArray under the same
laser power and PMT within the linear dynamic range. The Cy3
and Cy5 dye bias was determined by measuring the fluorescence
intensity of an equal mole of Cy3- and Cy5-labeled LDR primers
manually deposited on a slide surface. This fluorescence intensity
ratio (W = ICy3/ICy5) was used to normalize the label bias when
calculating the methylation ratio Cy3/(Cy3 + Cy5). MetaMorph
Imaging System (Universal Imaging) was used to create images
depicting the Cy3 (red) and the Cy5 (green)
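
A minimal sketch of how such a dye-bias factor might be applied. The
source gives only the ratio W and the methylation ratio; the specific
correction used here (scaling the Cy5 signal by W before forming the
ratio) is an assumption made for illustration, as are the function
names and all numbers.

```python
def dye_bias(cy3_equimolar, cy5_equimolar):
    """W = I_Cy3 / I_Cy5 measured from equimolar Cy3- and Cy5-labeled primers."""
    return cy3_equimolar / cy5_equimolar


def normalized_methylation_ratio(cy3, cy5, w):
    """Cy3 / (Cy3 + W * Cy5), with Cy5 rescaled onto the Cy3 intensity scale.

    Assumed form of the correction; the source does not spell it out.
    """
    cy5_scaled = w * cy5
    return cy3 / (cy3 + cy5_scaled)


# Hypothetical numbers: Cy3 reads half as bright as Cy5 at equal molarity.
w = dye_bias(cy3_equimolar=500.0, cy5_equimolar=1000.0)
raw = 300.0 / (300.0 + 600.0)
corrected = normalized_methylation_ratio(300.0, 600.0, w)
print(w, round(raw, 3), round(corrected, 3))
```

With these made-up readings, an equimolar mix of methylated and
unmethylated product would look one-third methylated before correction
and one-half methylated after it.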

Oligonucleotide design and synthesis 

Oligonucleotides were obtained from IDT or synthesized
in-house on an ABI 391 DNA Synthesizer (PE Biosystems) using
standard phosphoramidite chemistry (Khanna et al. 1999).
Spacer phosphoramidite C18, 3'-amino-modifier C3 CPG, C3
spacer, and Cy3, Cy5, and standard phosphoramidites were
purchased from Glen Research. All other reagents were purchased
from PE Biosystems. The zip-code oligonucleotides were
synthesized on a 3' amino modifier C3 column with a spacer C18
inserted before the first base. The common LDR primers were
synthesized with 5' phosphates and 3' C3 spacers as blocking
groups. Oligonucleotides with cyanine labels were cleaved from
the CPG supports and deprotected according to manufacturer's
recommendations. Both labeled and unlabeled LDR
oligonucleotides were purified and desalted on SuperPure columns
(Biosearch Technologies) according to the manufacturer's
instructions, then spin-dried (Speed-Vac) and stored at -20°C. For
those primers that inevitably covered CpG dinucleotides in the
body of their sequences, the nucleotides that base paired with
cytosines in CpG dinucleotide were synthesized in two ways.
One was to use nucleotide analogs dK or dP in the primers'
syntheses. The pyrimidine derivative dP base pairs with either A or
G, while the purine derivative dK base pairs with either C or T at
similar efficiency. Alternatively, to reduce the cost of primer
synthesis, those nucleotide positions with analogs dK or dP
incorporated were substituted by nucleotides dG or dT, respectively.
For example, the substituted nucleotide dG in a PCR primer
formed either Watson-Crick base pair with C (methylated) or
wobble base pair with U (unmethylated) on the bisulfite-
modified DNA template.
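
The pairing logic in the example above can be written out as a small
sketch. It is a simplified model for illustration only: the function
names are hypothetical, and only canonical Watson-Crick pairs and the
G:U / G:T wobble pair are considered.

```python
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "T"), ("T", "A"),
                ("A", "U"), ("U", "A")}
WOBBLE = {("G", "U"), ("U", "G"), ("G", "T"), ("T", "G")}


def bisulfite_base(base, methylated):
    """Bisulfite converts unmethylated C to U; methylated C is left as C."""
    base = base.upper()
    if base == "C" and not methylated:
        return "U"
    return base


def pairing(primer_base, template_base):
    """Classify a primer/template base pair in this simplified model."""
    pair = (primer_base.upper(), template_base.upper())
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


# A primer dG opposite the cytosine of an internal CpG on the template:
for methylated in (True, False):
    template = bisulfite_base("C", methylated)
    print(methylated, template, pairing("G", template))
```

In this model the dG position pairs with the template whether or not
the internal CpG was methylated, which is the property the substituted
primers rely on.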

Universal Array fabrication 

Polymer-coated slides were fabricated as previously described
(Gerry et al. 1999; Favis et al. 2000) or were purchased (CodeLink
slides) from Amersham Biosciences. Universal Arrays were
spotted by using a PixSys5500 robot with a quill-type spotter in a
controlled humidity chamber (Cartesian Technologies). Zip-code
oligonucleotides each with a unique 24-mer sequence were
prepared by mixing 5 µL of 1000 µM stock oligonucleotides with 5
µL of 0.4 M K2HPO4/KH2PO4 (pH 8.5) in 384 conical well
spotting plates. Arrays were printed under relative humidity 60%-
70%. To ensure that all the zip-codes were spotted without
cross-contamination during array fabrication, one out of 10 slides on
average was subjected to quality control by hybridizing
fluorescein-labeled zip-code complements targeting a combination of
rows or columns of zip-code addresses. A batch of fabricated
arrays passed the quality control only when specific fluorescein
signals were present on all the targeted rows and columns
without extraneous signals on the adjacent, unexpected neighboring
addresses.

Pyrosequencing 

A promoter sequence of MGMT was PCR amplified by using 1 µL
bisulfite-modified DNA, 400 µM of each dNTP, 1x AmpliTaq
Gold PCR buffer, 4 mM MgCl2, 0.2 µM PCR primers (5'-
GGTrnAGGAGGGGAGAGATT-3' and 5'-CCTAAtiCCRA
Al AACCCriC-3'), and 1.25 U AmpliTaq Gold polymerase
(Applied Biosystems). The PCR condition was as follows: 15 min at
94°C; 45 cycles of 15 sec at 95°C, 30 sec at Sti'C, and 15 sec at
72°C; followed by a final extension step for 5 min at 72°C. Three
sequencing primers (5'-(iTAt,;TAGnTAGA(;TAGc;AI-3', 5'-
r rn AGAGAG IT ITTAGGA 1-3', and 5'-AAATI AAGGTA I A
GAG TTTT-3') were designed to determine the CpG dinucleotide
methylation levels. The primers were designed and experiments
were performed by Biotage.

Detection and Serotyping of Dengue Virus in Serum Samples by
Multiplex Reverse Transcriptase PCR-Ligase Detection
Reaction Assay

S. Das, M. R. Pingle, J. Munoz-Jordan, M. S. Rundell, S. Rondini, K. Granger, G.-J. J. Chang,
E. Kelly, E. G. Spier, D. Larone, E. Spitzer, F. Barany, and M. Golightly
Department of Medicine, Division of International Medicine and Infectious Diseases, Department of Microbiology and Immunology, and
Department of Pathology and Laboratory Medicine, Weill Medical College of Cornell University, New York, and Department of
Pathology, Stony Brook University Medical Center, Stony Brook, New York; Centers for Disease Control and Prevention, San Juan,
Puerto Rico; Centers for Disease Control and Prevention, Fort Collins, Colorado; Walter Reed Army Institute of Research,
Silver Spring, Maryland; and Applied Biosystems, Foster City, California

Received 25 January 2008/Returned for modification I'l June 2008/Accepted 26 July 2008

The detection and successful typing of dengue virus (DENV) from patients with suspected dengue fever is
important both for the diagnosis of the disease and for the implementation of epidemiologic control measures.
A technique for the multiplex detection and typing of DENV serotypes 1 to 4 (DENV-1 to DENV-4) from clinical
samples by PCR-ligase detection reaction (LDR) has been developed. A serotype-specific PCR amplifies the
regions of genes C and E simultaneously. The two amplicons are targeted in a multiplex LDR, and the resultant
fluorescently labeled ligation products are detected on a universal array. The assay was optimized using 38
DENV strains and was evaluated with 350 archived acute-phase serum samples. The sensitivity of the assay was
98.7%, and its specificity was 98.4%, relative to the results of real-time PCR. The detection threshold was 0.017
PFU for DENV-1, 0.004 PFU for DENV-2, 0.8 PFU for DENV-3, and 0.7 PFU for DENV-4. The assay is specific.
It does not cross-react with the other flaviviruses tested (West Nile virus, St. Louis encephalitis virus, Japanese
encephalitis virus, Kunjin virus, Murray Valley virus, Powassan virus, and yellow fever virus). All but 1 of 26
genotypic variants of DENV serotypes in a global DENV panel from different geographic regions were
successfully identified. The PCR-LDR assay is a rapid, sensitive, specific, and high-throughput technique for
the simultaneous detection of all four serotypes of DENV.



The dengue virus (DENV), a mosquito-borne flavivirus,
consists of four closely related but genetically distinct antigenic
serotypes: DENV serotype 1 (DENV-1), DENV-2, DENV-3,
and DENV-4. It is tropical and subtropical in distribution and
is prevalent in Asia, Africa, and Central and South America
(45). Infection with any of the four serotypes of DENV may
cause a mild febrile illness, dengue fever (DF). In some cases,
however, more-severe manifestations, such as dengue hemorrhagic
fever (DHF) and dengue shock syndrome (DSS), occur;
these may prove fatal without proper early intervention (15).

Geographic spread of both the mosquito vector and the virus
over the past 25 years has led to the increased occurrence of
epidemic DF/DHF/DSS, making dengue a major global health
problem. The disease is endemic in more than 100 countries,
with an estimated 2.5 billion people at risk of infection. It is
estimated that 50 million DENV infections occur each year,
with 500,000 cases of DHF and at least 22,000 deaths, mainly
in children (31, 32, 45; WHO/WPRO/SEARO meeting on
DengueNet implementation in Southeast Asia and the
Western Pacific, Kuala Lumpur, Malaysia, 11 to 13 December
2003).

DENV infection confers lifelong serotype-specific immunity.
Multiple infections with different DENV serotypes occur in
regions of hyperendemicity ([first reference number illegible], 35).
Secondary infections with a different DENV serotype are major
risk factors for DHF and DSS (13, 14, 39) due to antibody-dependent
enhancement of disease (35). Serotype identification and the
differentiation of primary and secondary infections are therefore
important both for patient management and for the implementation
of public health measures (26, 33).

The diagnosis of DENV infection and the typing of DENV
serotypes can be confirmed using viral isolation techniques,
serology, or molecular methods. Virus isolation is the gold
standard for detection but requires 7 to 10 days and is often
insensitive (26). Serological tests for the detection of viral
antibodies, such as immunoglobulin M and immunoglobulin G
antibody capture enzyme-linked immunosorbent assays, require
the demonstration of a rise in antibody titer from an
acute-phase to a convalescent-phase serum sample and therefore
have little impact on patient management (24, 41). Additionally,
the extensive antigenic cross-reactivity in serological
assays, both among flaviviruses and between DENV serotypes,
further complicates definitive diagnosis and the interpretation
of the assays (18, 20, 41).

Molecular techniques based on the detection of genomic
sequences by reverse transcription-PCR (RT-PCR), nested
PCR, and real-time PCR are rapid and sensitive and have
replaced virus isolation as the new standard method for the
detection of DENV in acute-phase serum samples (15). These
methods identify the four different serotypes by using genus- or
serotype-specific primers or a combination of both. A two-step
nested RT-PCR approach is routinely employed in laboratories
worldwide (27).

Although most molecular techniques have the advantage of
being rapid and sensitive, there are limitations. Real-time assays
are limited by the large number of reactions required, as
in SYBR green-based assays, and the manipulation of samples
necessary for serotype identification is a limitation of fluorogenic
dye-based assays (20, 26, 29, 44). In addition, the genetic
diversity among DENV isolates raises concerns regarding
false-negative PCR results due to mismatches in sequences
resulting from the continual evolution of variant viral sequences.
For example, recent studies indicate the presence of
well-defined phylogenetic groups within each serotype of
DENV. The genotypes described within the different serotypes
are based on sequence variations in gene E and NS1. The
number of genotypes varies, ranging from three (for DENV-4)
to five (for DENV-1, -2, and -3), depending on the region
sequenced (19, 11).

The present study was conducted on clinical samples from
Puerto Rico, which is regarded as a prototype of the urban
establishment of DENV. During the past 2 decades, Puerto
Rico has experienced increasingly severe DENV epidemics
(4). All four serotypes and multiple genotypes have been in
circulation on this island. This situation is considered ideal for
the genetic evolution of the virus, which has been demonstrated
by molecular analysis of DENV-2 and -4 (3, 4). The
situation is expected to be further complicated by the recent
introduction of West Nile virus, which may clinically mimic DF
and is difficult to distinguish from DENV by serologic tests,
due to cross-reactive antibodies (7, 25).

In this study, we combine multiplex PCR using multiple
degenerate primers with a ligase detection reaction (LDR) and
a universal zip-code array for the simultaneous identification
and serotyping of DENV from viral cultures and clinical samples
from Puerto Rico. Originally developed for discriminating
single-base mutations or polymorphisms in cancer genes, LDR
uses a thermostable DNA ligase that ligates two adjacent
oligonucleotides annealed to a complementary target only if the
nucleotides are perfectly matched at the junction (1, 2). This
method has subsequently been used in detecting mutations,
insertions, and deletions in cancer genes (8-10, 21, 22). More
recently this assay has been adapted for the detection of bacterial
pathogens in blood cultures (34) and of West Nile virus
in serum and mosquito pools (38). Since even a single base
mismatch at the ligation junction prevents successful ligation,
the technique is highly specific (21). Furthermore, such assays
are ideal for multiplexing, since several primer sets can ligate
along a DNA template without the interference encountered
in purely polymerase-based assays (10, 22).
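
The junction rule that gives LDR its specificity can be illustrated with a toy model. The sketch below is not part of the study: the function name `can_ligate`, the example sequences, and the simplification that only the two bases flanking the nick are checked (hybridization elsewhere along the primers is ignored) are assumptions made purely for illustration.

```python
def can_ligate(target: str, upstream: str, downstream: str, junction: int) -> bool:
    """Toy LDR rule: ligation needs a perfect match on both sides of the nick.

    `target` is written in the same sense as the primers, so a primer base
    "matches" when it equals the corresponding base of `target`.
    `junction` is the index in `target` where the downstream primer starts.
    Hypothetical helper; real ligase fidelity also depends on temperature,
    enzyme, and sequence context.
    """
    if not upstream or not downstream:
        return False
    start = junction - len(upstream)
    end = junction + len(downstream)
    if start < 0 or end > len(target):
        return False  # primers would hang off the template
    # 3'-terminal base of the upstream primer must match the template.
    upstream_ok = upstream[-1] == target[junction - 1]
    # 5'-terminal base of the downstream primer must match the template.
    downstream_ok = downstream[0] == target[junction]
    return upstream_ok and downstream_ok


# Invented 20-base target and a variant with one change next to the nick.
wild_type = "ACGTACGTAGCTAGCTAGGA"
variant = "ACGTACGTAACTAGCTAGGA"  # index 9 changed from G to A
up, down = wild_type[2:10], wild_type[10:18]

print(can_ligate(wild_type, up, down, junction=10))  # True
print(can_ligate(variant, up, down, junction=10))    # False
```

In this simplified picture a single mismatch at the 3' end of the upstream primer is enough to block the ligation product, which is the property the assay relies on to separate closely related serotype sequences.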

The LDR primers are designed to produce ligation products
that are fluorescently labeled at their 3' ends and have zip-code
complements (complementary to zip-code addresses in a universal
array) appended to their 5' ends. The specific zip-code
address spotted onto the array hybridizes only to the complementary
sequence included on the LDR product (9, 10). A
schematic representation of the assay is shown in Fig. 1. The
universal array is a powerful technique that permits the simultaneous
detection of a large number of genes or gene products,
making it ideal for use in multiplex, high-throughput assays.

Thus, the unique specificity and sensitivity of PCR-LDR
coupled to the specificity of the universal array enables the
detection and differentiation of all four serotypes of DENV in
a single assay.



MATERIALS AND METHODS
Viral cultures and RNA from viruses and clinical samples. Viral culture
supernatants were obtained from the Dengue Branch, Centers for Disease Control
and Prevention (CDC), San Juan, Puerto Rico (n = 38). These included
[number illegible] isolates each of DENV-1 and -2 and [number illegible] isolates
each of DENV-3 and -4.

DENV strains selected from the global DENV panel (6) maintained at the
Division of Vector-Borne Infectious Diseases (DVBID), CDC, Fort Collins, CO,
were used (Table 1). The panel consists of unique strains of DENV-1 to -4
representing the latest isolates of each serotype from different geographic
regions. Isolates of other flaviviruses (Table 2) were kindly provided by Robert
Lanciotti at the DVBID, CDC; the World Reference Center for Emerging
Viruses and Arboviruses at the University of Texas Medical Branch in Galveston;
M. Niedrig from the Robert Koch Institute, Berlin, Germany (European Network
for Diagnostics of Imported Viral Diseases); and the New York City
Department of Health (38).

A total of 350 clinical serum samples tested in this study were obtained from
the repository of the Dengue Branch, CDC, San Juan, Puerto Rico, where they
were confirmed as either positive or negative for DENV by real-time PCR
([reference number illegible]). The samples were collected from patients with
suspected DF during epidemics and in the interepidemic periods, corresponding
with the years when the different serotypes circulated in the region (the year
ranges given for DENV-1, DENV-2, DENV-3, and DENV-4 are illegible in the
source). The serum specimens were collected from 2[second digit illegible]
different municipalities in Puerto Rico.

RNA extraction. Total RNA was extracted from 140 µl of human serum
samples or virus-infected tissue culture supernatants by using the QIAamp viral
RNA kit (Qiagen, Inc., Valencia, CA) according to the manufacturer's instructions.
The RNA was extracted in 60 µl of elution buffer and was used immediately
to synthesize the first-strand cDNA with the SuperScript first-strand synthesis
system for RT-PCR (Invitrogen, Carlsbad, CA) according to the
manufacturer's protocol. Briefly, a master mix consisting of 1 mM deoxynucleoside
triphosphates, 150 ng of random hexamers, 5 mM MgCl2, 2 µl of 10× buffer,
10 mM dithiothreitol, 40 U of RNaseOUT recombinant RNase inhibitor, and 50
U of SuperScript II reverse transcriptase enzyme was prepared. A 6-µl aliquot of
extracted RNA was added to this mixture, and the sample was incubated at 42°C
for 50 min, followed by 15 min at 70°C to terminate the reaction. An additional
incubation of 20 min at 37°C with 1 µl of RNase H eliminated any residual RNA
in the reaction product. The resultant cDNA was stored at -20°C for future use.
Primer design and assay development. PCR primers were designed using
Oligo 6.0 software (Molecular Biology Insights, Cascade, CO) and were targeted
to the capsid protein (C) and envelope protein (E) regions of the DENV
genome. In the original version of the assay, nested forward and reverse primers
were designed for the amplification of one region in gene C and two regions in
gene E. Relatively conserved regions of the genome were chosen after alignment
of known DENV sequences accessed from GenBank for primer design, but due
to the extensive sequence variation in the DENV genome, different sets of
primers had to be designed for each serotype, and degenerate bases were
included to accommodate variation within each serotype. Primers had melting
temperatures of approximately 72 to 75°C and were designed such that there
were no more than three degenerate positions in each primer. Universal tail
sequences were appended to the 5' ends of forward and reverse PCR primers to
prevent the formation of primer dimers. A total of [number illegible] PCR primers
were designed for all four serotypes of DENV (see Tables S1 and S2 in the
supplemental material).
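
The design constraint of no more than three degenerate positions per primer is easy to check mechanically. The sketch below is an illustration only: the helper names and the example primer are invented, and the actual primer sequences are in the supplemental tables, which are not reproduced here. The only outside fact used is the standard IUPAC nucleotide ambiguity code.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degenerate_positions(primer: str) -> list[int]:
    """Indices of bases that stand for more than one nucleotide."""
    return [i for i, base in enumerate(primer.upper()) if len(IUPAC[base]) > 1]


def expand(primer: str) -> list[str]:
    """All concrete sequences represented by a degenerate primer."""
    pools = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*pools)]


def passes_design_rule(primer: str, max_degenerate: int = 3) -> bool:
    """True if the primer has at most `max_degenerate` degenerate positions."""
    return len(degenerate_positions(primer)) <= max_degenerate


example = "ACGRTTYGCANGT"  # hypothetical primer, not from the study
print(degenerate_positions(example))  # [3, 6, 10]
print(len(expand(example)))           # 16
print(passes_design_rule(example))    # True
```

Capping the number of degenerate positions keeps the pool of distinct oligonucleotides per primer small (here 2 × 2 × 4 = 16), so each individual sequence remains at a usable concentration in the multiplex reaction.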

LDR primers were designed at three positions within each of the gene E
amplicons and two positions within the gene C amplicon. The upstream LDR
primers had unique oligonucleotides (zip-code complements), 24 bases long,
attached at the 5' end; the downstream primers had a fluorescent label, either
6-carboxyfluorescein (FAM) or cyanine 3 (Cy3), at the 3' end. A total of
[number illegible] LDR primers were designed (see Tables S3 and S4 in the
supplemental material), with melting temperatures of 75 to 80°C; degenerate
bases (no more than three in each primer) were introduced where required to
account for sequence variations among different strains of the virus. Primers
were obtained from Integrated DNA Technologies (Coralville, IA).

FIG. 1. Schematic of the PCR-LDR assay for the identification of DENV serotypes. Serotype-specific PCR primers amplify one region of gene
C and one region of gene E (for clarity, only the gene E amplicon is shown). Within each PCR amplicon, LDR primers are designed to identify
and differentiate the four different DENV serotypes. The LDR primers target two locations in gene C and three locations in gene E. The 5'
upstream LDR primers bear zip-code complements, while the 3' downstream LDR primers have either a FAM or a Cy3 fluorescent label. Ligation
of the LDR primers results in fluorescently labeled products of different lengths that are then detected either by CE or on a universal array. NTR,
nontranslated region.

PCR-LDR and detection of products. Gene C and gene E were amplified
using serotype-specific primers. The PCR mixture consisted of 10 mM Tris-HCl
buffer containing 50 mM KCl (pH 8.0), [first digit illegible].5 mM MgCl2, 0.8 mM
deoxynucleotide triphosphates, and 1.25 U of AmpliTaq Gold DNA polymerase
(Applied Biosystems, Foster City, CA).

PCRs were optimized by performing serotype-specific uniplex reactions using
DENV culture supernatants with 5 to 10 pmol of each primer per reaction. To
this mixture, 1 µl of template cDNA was added to make a final volume of 25 µl.
Amplification was performed using a GeneAmp 9700 thermocycler (Applied
Biosystems, Foster City, CA). Initial denaturation of template DNA was
achieved by heating at 95°C for [duration illegible] min. This was followed by 40
cycles of 30 s at 95°C, [annealing step illegible], and 1 min at 72°C. A final
extension step was conducted for 7 min at 72°C, followed by a termination step
at 99°C for 30 min.

Multiplex PCRs were performed using the same method, except that all
primers for all serotypes were used in a single reaction. Multiplex reactions were
optimized by varying the concentrations of the primer (10 to 0.5 nM) and MgCl2
(1.5 to 2.5 mM). Ultimately, the optimum reaction conditions were found to be
similar to those described above, except that the final concentration of each
primer used was [value illegible] and the MgCl2 concentration was [value illegible] mM.

Ligation reactions were conducted in a solution (20 µl) containing LDR buffer
(30 mM Tris-HCl buffer [pH 7.6], 100 mM KCl, 10 mM MgCl2), 1 mM NAD+,
1 mM dithiothreitol, 12.5 nM (250 fmol) each LDR primer, 2 µl of each PCR
product, and [concentration illegible] nM AK16D ligase (expressed from Thermus
species AK16D) (42). Reaction mixtures were initially heated at [temperature
illegible] for 1 min, followed by [number illegible] thermal cycles at [temperature
illegible] for [duration illegible] s (denaturation) and 64°C for [duration illegible]
min (annealing/ligation). The ligation products were analyzed by two methods:
capillary electrophoresis (CE) and a universal zip-code array.



CE. A 0.5-µl aliquot of each LDR product was added to 9.2 µl of Hi-Di
formamide and 0.3 µl of a LIZ 500 DNA size standard (Applied Biosystems,
Foster City, CA). The samples were denatured by heating to [temperature
illegible] for 3 min and were cooled rapidly to 4°C before being loaded onto the
ABI 3730 DNA analyzer for CE. The data generated were analyzed using
GeneMapper software (version 3.5; Applied Biosystems, Foster City, CA). The
fragment size and peak area data of the different ligation products were exported
and used to generate a 2-dimensional virtual gel image using the Gelrender
software program (34). All reactions were performed in duplicate, and the
experiment was run twice.

Universal zip-code array. Unique 24-bp oligonucleotides (zip-code addresses)
were double spotted onto polymer-coated slides (CodeLink slides; GE Healthcare,
Piscataway, NJ). Zip-code addresses for spotting were prepared in 50 mM
sodium phosphate (pH 8.5) at a final concentration of 25 µM in a 384-well plate.
A fiducial oligonucleotide (1 µM) was added to the printing mix in each well and
cospotted with each zip-code address. Arrays were printed using a QArray Mini
robotic array printer (Genetix, Boston, MA) at [temperature illegible] and 50 to
60% humidity. Printed slides were incubated in a saturated NaCl chamber
overnight and then treated with a blocking solution (0.1 M Tris, 50 mM
ethanolamine [pH 9.0]) to block residual reactive carboxyl groups. The slides
were washed with 4× standard sodium citrate (SSC) buffer (20× SSC is 3 M
sodium chloride and 0.3 M sodium citrate; pH 7.0) and 0.1% sodium dodecyl
sulfate (SDS) and were dried by spinning. A 6-carboxy-X-rhodamine-labeled
fiducial complement included in the hybridization mixture served as an internal
positive control to determine the position and quality of each address. Printed
slides were randomly selected for quality control by hybridizing fluorescently
labeled zip-code complements; a batch of slides was used only if the quality
control produced a specific fluorescent signal in the absence of extraneous
signals on adjacent addresses. Printed slides were stored in a desiccator at room
temperature until use.

Ligation products were diluted in a hybridization buffer (5× SSC buffer
containing 0.1% SDS, 0.1 mg/ml salmon sperm DNA [Fisher Scientific], and 5
nM fiducial complement) in a final volume of [volume illegible] µl. A ProPlate
multiarray slide chamber (Grace Bio-Labs, Bend, OR) was attached to universal
array slides, and the entire amount of the hybridization mixture was added to
the chambers. Hybridization was carried out at [temperature illegible] for 2 h in
the dark in a hybridization oven (Lab-Line; VWR, West Chester, PA). Following
hybridization, the slides were rinsed with 5× SSC and washed with 1× SSC-0.1%
SDS at 60°C for [duration illegible] min. After two more wash steps of 1 min
each with 0.2× SSC and 0.1× SSC, respectively, at room temperature, the slides
were spin-dried and scanned using a ProScanArray (Perkin-Elmer, Wellesley,
MA). Positive signals were detected and quantified with ScanArray Express
(version 3.9; Perkin-Elmer, Wellesley, MA) and were manually inspected when
necessary. The signal intensity data obtained were transferred as text files, and
only those addresses where the signal intensity was at least 10-fold higher than
background were considered positive. Reactions were performed twice, and
identification was confirmed for both experiments.

TABLE 1. Global DENV panel used in assay validation

Origin         Yr of isolation   Source     Serotype   Genotype
India          1997              Human      DENV-1     Genotype I
Thailand       199?              Human      DENV-1     Genotype I
Indonesia      1978              Mosquito   DENV-1     Genotype II
Philippines    1984              Human      DENV-1     Genotype II
Jamaica        1977              Human      DENV-1     Genotype III
Costa Rica     1994              Human      DENV-1     Genotype III
Dakar          1970              Human      DENV-2     Sylvatic genotype
Saudi Arabia   1994              Human      DENV-2     Cosmopolitan/genotype III
Malaysia       1968              Human      DENV-2     Cosmopolitan/genotype III
Myanmar        1976              Human      DENV-2     Asian genotype I
Vietnam        1995              Unknown    DENV-2     Asian genotype II
Philippines    1996              Human      DENV-2     Asian genotype II
Bolivia        1998              Unknown    DENV-2     American/Asian
Puerto Rico    1994              Unknown    DENV-2     American/Asian
Tonga          1974              Unknown    DENV-2     American
Philippines    1997              Human      DENV-3     Genotype I
Malaysia       1997              Human      DENV-3     Genotype I
Thailand       1987              Unknown    DENV-3     Genotype II
Myanmar        1976              Unknown    DENV-3     Genotype II
Mexico         1997              Human      DENV-3     Genotype III
Sri Lanka      1991              Human      DENV-3     Genotype III
Malaysia       Unknown           Monkey     DENV-4     Sylvatic genotype
Thailand       1985              Human      DENV-4     Genotype I
Malaysia       1997              Human      DENV-4     Genotype I
Mexico         1997              Human      DENV-4     Genotype II
Puerto Rico    1994              Human      DENV-4     Genotype II

A question mark denotes a digit that is illegible in the source. The Sample ID
column survives only in part (23 entries for 26 rows), so the IDs cannot be
matched to rows reliably. As extracted, in order, they read: 27f)RKI, 498 RKI,
1266, 12150, 22«69(l, 1 116 74, UCI02,''i4, P8I407MS, S-4002 1, BC77'9h,
BCI7I/96, BCI41/V6, S- 1 4633, BC182/V6, BCI4/97, MK 594-87, S-4(I5H(1,
BC18S/97, 271242, BC 1 23/97, DKJ.III9, BrU/')7, BC25N/97.
LOD of the assay. To determine the limit of detection (LOD), 10-fold serial
dilutions of viral culture stocks were prepared for all four DENV serotypes from
standard stock cultures, with starting concentrations of 250,000 PFU/ml for
DENV-1 (strain Hawaii), 290,000 PFU/ml for DENV-2 (strain New Guinea C),
1,200,000 PFU/ml for DENV-3 (strain Philippines H87), and 10,000,000 PFU/ml
for DENV-4 (strain Philippines H241). The concentration ranges tested for the
different serotypes were as follows: 2.5 × 10^5 to 0.0025 PFU/ml (1.75 × 10^2 to
1.75 × 10^-6 PFU/reaction) for DENV-1, 2.9 × 10^5 to 0.0029 PFU/ml (2.03 × 10^2
to 2.03 × 10^-6 PFU/reaction) for DENV-2, 1.2 × 10^6 to 0.0012 PFU/ml (8.4 ×
10^2 to 0.84 × 10^-6 PFU/reaction) for DENV-3, and 1 × 10^7 to 0.01 PFU/ml (7 ×
10^3 to 0.7 × 10^-5 PFU/reaction) for DENV-4. Dilutions were prepared in Dulbecco's
minimum essential medium supplemented with 10% fetal bovine serum (Invitrogen,
Carlsbad, CA). RNA was extracted from 140 µl of each dilution, and
RT-PCR-LDR with a universal array was used to determine the lower limit of
detection.

TABLE 2. Details of other flavivirus strains tested

Flavivirus                      Strain       Origin/yr
St. Louis encephalitis virus    MSI-7        Mississippi/1977
Murray Valley fever virus       OR2          Victoria, Australia/1951
Kunjin virus                    MRM 16       Australia/1960
Powassan encephalitis virus     M11665       Ontario, Canada/19?5
Yellow fever virus              17D          Ghana/1927
Japanese encephalitis virus     SA14-14-2    China/1954
West Nile virus                 Unknown(a)   New York City/1999, 2000

(a) Unidentified strains collected from mosquito pools (38).



RESULTS 

Assay design and initial optimization. Initial optimization
and validation of the multiplex PCR-LDR assay for the detection
and identification of DENV serotypes were performed on
38 culture supernatants from stock virus cultures. Products of
PCR amplification were analyzed by electrophoresis on a 2%
agarose gel. Both uniplex (each serotype amplified separately
for each gene target) and multiplex (all serotypes amplified
together for each gene target) reactions were individually
validated, and PCR products were observed at 300 bp and 400 bp,
respectively, for gene C and gene E (data not shown). LDRs
were then performed on the PCR products, and the LDR
products were analyzed by CE. The assay was able to successfully
identify 37 out of the 38 samples and to type them correctly
compared to standard detection by real-time PCR at the
CDC in Puerto Rico (data not shown). The sample that failed
identification was found to have insufficient nucleic acid. Our
initial optimization and validation experiments for the detection
of DENV serotypes were carried out successfully by analyzing
LDR products from two amplicons (one each in gene E
and gene C) (data not shown). Analysis of the third amplicon
did not provide any additional information for the detection or
differentiation of DENV serotypes. All further experiments
were therefore performed using primers for only two amplicons:
46 PCR primers and 75 LDR primers in all (see Tables
S1 to S4 in the supplemental material).

Assessment of the multiplex capability of the PCR-LDR
assay. A multiplex PCR amplification of both target regions
used all 46 PCR primers in a single reaction. Subsequently,
amplicons were subjected to a multiplex ligation reaction using
the 75 primers for all four DENV serotypes in a single reaction.
For the initial validation, LDR products were analyzed by
CE. Results from representative samples are shown in Fig. 2.
The sizes and number of LDR products at a single position
may differ depending on the target viral sequence in that
particular ligation position. Table 3 shows the observed lengths of
the LDR products for each serotype, which differed by 1 to 3
bases from those expected. A given DENV serotype can produce
four or five different ligation products of approximately
73 to 92 bases. Due to variations in product length and the
comigration of products of similar length, CE is not an optimal
method for the interpretation of the PCR-LDR assay (Table 3
and Fig. 2). In a similar study, we have observed that LDR
products of identical lengths but different sequences (due to
the use of degenerate oligonucleotides) can migrate at separate
positions in CE, resulting in broadened peaks that may
overlap with nearby peaks (38). Other studies have also observed
that it is difficult to predict CE results based on the
lengths of the primers, because the migration pattern cannot
be deduced from the length of the product (28). Therefore,
after successful optimization of the PCR-LDR, the universal
array was used for the analysis of LDR products.



TABLE 3. Comparison of the expected sizes of the ligation
products and the sizes observed by CE analysis

                      Sizes of LDR products (bp)
Serotype   Target     Expected       Observed
DENV-1     Gene C     76, 83         74.5, 76-76.2, 83.1-83.6
           Gene E     7?, 78, 88     73.6-74, 75.9-76.2, 88.6-88.8
DENV-2     Gene C     7?, 86         76.2-76.4, 8?.?-86.0
           Gene E     85, ??, 92     82.4-82.6, 84.7-84.8, 91.7-91.9
DENV-3     Gene C     74, 74         73.7-73.9
           Gene E     75, 78, 78     75.2-75.4, 77.4-77.6
DENV-4     Gene C     7?, 84         79.8-80.0, 81.?-81.4, 83.8-84.1
           Gene E     7?, 8?, 86     73.5-75.7, 82.0-82.8, 84.0-84.3

A question mark denotes a digit that is illegible in the source. Observed values
are paired with rows in the order in which they were extracted.



The detection of LDR products on the universal array is
independent of their size. The serotype-specific products hybridize
to unique complementary zip-code addresses on the
array. The LDR products are detected by fluorescent signals
appended to their 3' ends. A typical universal array layout is
shown in Fig. 3A. All zip-code addresses were spotted in duplicate.
Two positive signals were required, one from each
amplicon or both from a single amplicon, for identification
(Fig. 3B). Two additional zip-code addresses were spotted for
DENV-4 due to the use of additional LDR primers designed
for the detection of gene E; however, for any given DENV-4
serotype, only five positive signals were observed on the array.
Two addresses on the array produced positive signals for both
DENV-1 and DENV-3 due to similarities in the primer sequences.
This did not, however, interfere with the identification
of the two serotypes, because the other addresses clearly
distinguished them (Fig. 3B).
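
The calling logic described here (a signal counts as positive at 10-fold over background, and at least two positive addresses are needed for a serotype call) can be sketched in a few lines of Python. Everything specific in the sketch is an assumption for illustration: the address names and the layout are invented, not the printed array map, and the requirement that at least one positive address be unique to the serotype is one possible way to handle the two addresses shared by DENV-1 and DENV-3, not necessarily the rule the authors applied.

```python
# Hypothetical layout: address names are invented. DENV-1 and DENV-3 share
# two addresses, and DENV-4 has two extra addresses, as in the text.
LAYOUT = {
    "DENV-1": {"zip01", "zip02", "zip03", "zip04", "zip05"},
    "DENV-2": {"zip06", "zip07", "zip08", "zip09", "zip10"},
    "DENV-3": {"zip04", "zip05", "zip11", "zip12", "zip13"},
    "DENV-4": {"zip14", "zip15", "zip16", "zip17", "zip18", "zip19", "zip20"},
}


def call_serotypes(signals, background, layout=LAYOUT, fold=10.0, min_signals=2):
    """Return the serotypes supported by the array signals.

    signals: mapping of address name -> fluorescence intensity.
    An address is positive if its intensity is >= fold * background.
    A serotype is called if it has >= min_signals positive addresses and at
    least one of them is not shared with another serotype (assumption).
    """
    positives = {addr for addr, s in signals.items() if s >= fold * background}
    calls = []
    for serotype, addrs in layout.items():
        others = set().union(*(v for k, v in layout.items() if k != serotype))
        hits = positives & addrs
        if len(hits) >= min_signals and (hits - others):
            calls.append(serotype)
    return sorted(calls)


# Example: strong signal on the DENV-1 addresses, near-background elsewhere.
all_addresses = set().union(*LAYOUT.values())
signals = {addr: 120.0 for addr in all_addresses}
signals.update({addr: 5000.0 for addr in LAYOUT["DENV-1"]})
print(call_serotypes(signals, background=100.0))  # ['DENV-1']
```

In the example the two shared addresses light up for DENV-3 as well, but because none of the DENV-3-specific addresses is positive, only DENV-1 is reported.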

Identification and typing of DENV from clinical samples.
Serum specimens from 161 cases of DF confirmed at the Dengue
Branch, CDC, Puerto Rico, were analyzed by the PCR-LDR
assay. The results are shown in Table 4. It was possible to
identify 159 of the 161 positive samples correctly (sensitivity,
98.7%). Serotype identification was concordant for all 159
samples in which DENV was detected. The two samples that
tested positive at the CDC but could not be detected by PCR-LDR
were archived samples; material could have been lost due
to prolonged storage or multiple freeze-thaw cycles. Specificity
was evaluated using a panel of 189 serum samples that tested
negative for the presence of DENV. Three of the negative
samples were identified as DENV-2 by our assay (specificity,
98.4%). Two of these samples came from patients presenting
with acute symptoms of DF but were found to be DENV
negative when tested by RT-PCR and an enzyme-linked
immunosorbent assay at the CDC; no second, paired serum
sample, which might have been tested for seroconversion, was ever
received from either patient. No clinical data were available for
the third patient, but the sample was found to be negative upon
repeated analysis by real-time PCR.
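
The sensitivity and specificity quoted above follow directly from the counts in the text (159 of 161 real-time PCR-positive samples detected; 3 of 189 real-time PCR-negative samples called positive). A minimal sketch of that arithmetic, with function and variable names chosen here for illustration:

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference-positive samples that the assay detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of reference-negative samples that the assay calls negative."""
    return true_neg / (true_neg + false_pos)


# Counts from the text, with real-time PCR at the CDC as the reference.
sens = sensitivity(true_pos=159, false_neg=2)   # 159 / 161
spec = specificity(true_neg=186, false_pos=3)   # 186 / 189

print(f"sensitivity = {sens:.4f}")  # sensitivity = 0.9876
print(f"specificity = {spec:.4f}")  # specificity = 0.9841
```

The computed sensitivity, 98.76%, matches the reported 98.7% when truncated to one decimal place rather than rounded; the specificity, 98.41%, matches the reported 98.4%.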

Performance of the PCR-LDR against the global DENV
panel and other flaviviruses. A global DENV panel consisting
of six strains each of DENV-1 and -3, nine strains of DENV-2,
and five strains of DENV-4 was tested using PCR-LDR with
the universal array for detection and serotype identification.
The strains belonged to different genotypes and came from
different geographic regions (Table 1) (6). All except one of
the strains (strain identification [ID] 10674; Dakar, 1970) were
detected and correctly serotyped by PCR-LDR. Sequence
analysis of gene E and gene C confirmed that the isolate had
significant genetic variation at the PCR primer binding sites.
Two additional PCR primers were subsequently designed and
added to the PCR primer mix in order to accommodate the
sequence difference and permit the amplification of this isolate
(see Tables S1 and S2 in the supplemental material).

FIG. 3. Detection of DENV by the universal array. (A) Universal microarray layout of expected positive signals (filled circles) for each DENV
serotype. Zip-code addresses were spotted in duplicate. Open circles indicate addresses available for the detection of other viruses as the assay is further
developed. (B) Representative universal array detection of DENV-1 to -4 using the RT-PCR-LDR assay. Each serotype generates at least four
or five unique signals that permit identification. DENV-1 and DENV-3 produce two common signals, as described in the text.

A panel of seven other flaviviruses (West Nile virus, Kunjin
virus, Powassan virus, Japanese encephalitis virus, St. Louis
encephalitis virus, Murray Valley encephalitis virus, and yellow
fever virus) was tested with the assay, and CE results are shown
in Fig. 2. The PCR-LDR primers were specific for DENV
serotypes, and no signals were detected for any of the other
flaviviruses by use of CE or the universal array.

TABLE 4. Detection and serotyping of DENV from clinical serum
samples using a multiplex RT-PCR-LDR-universal array

CDC serotype     No. of samples   No. of samples with the following serotype by the
determination    tested           RT-PCR-LDR-universal array
                                  DENV-1   DENV-2   DENV-3   DENV-4   Negative
DENV-1           20               20       0        0        0        0
DENV-2           62               0        62       0        0        0
DENV-3           59               0        0        57       0        2
DENV-4           20               0        0        0        20       0
Negative         189              0        3        0        0        186

Determination of LOD. Using serial dilutions of the virus
stock cultures for RNA extraction and subsequent PCR-LDR,
detection limits were found to be 0.017, 0.004, 0.8, and 0.7
equivalent PFU of virus for DENV-1, DENV-2, DENV-3, and
DENV-4, respectively. Figure 4 shows the universal array data
for the DENV-1 detection limit. Detection limits were found
to be the same by using CE or the universal array and were
calculated by taking the dilution factors into account; i.e., RNA
was extracted from 140 µl of each dilution and was subsequently
eluted out in 60 µl of buffer, from which 6 µl was used
for cDNA synthesis.
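
The conversion from stock concentration (PFU/ml) to PFU equivalents per PCR can be written out explicitly. The sketch below uses the volumes stated in the text (140 µl extracted, 60 µl eluate, 6 µl into cDNA synthesis, 1 µl of cDNA per PCR). The 20-µl cDNA reaction volume is an assumption: it is not stated in the text, and is chosen because it reproduces the per-reaction values quoted in Materials and Methods (for example, 1 × 10^7 PFU/ml corresponding to 7 × 10^3 PFU/reaction for DENV-4). The function name is likewise illustrative.

```python
def pfu_per_reaction(
    pfu_per_ml: float,
    extracted_ul: float = 140.0,     # volume of each dilution extracted (from text)
    elution_ul: float = 60.0,        # RNA elution volume (from text)
    rna_into_rt_ul: float = 6.0,     # RNA carried into cDNA synthesis (from text)
    rt_volume_ul: float = 20.0,      # ASSUMED cDNA reaction volume (not in text)
    cdna_into_pcr_ul: float = 1.0,   # template cDNA per PCR (from text)
) -> float:
    """PFU equivalents that end up in one PCR, given the stock titer."""
    pfu_extracted = pfu_per_ml * extracted_ul / 1000.0       # µl -> ml
    pfu_in_rt = pfu_extracted * rna_into_rt_ul / elution_ul  # fraction of eluate used
    return pfu_in_rt * cdna_into_pcr_ul / rt_volume_ul       # fraction of cDNA used


# Top of the DENV-4 dilution series and the dilutions matching two reported LODs.
for label, titer in [("DENV-4 stock", 1e7), ("DENV-4 LOD", 1e3), ("DENV-1 LOD", 25.0)]:
    print(f"{label}: {titer:g} PFU/ml -> {pfu_per_reaction(titer):.4g} PFU/reaction")
```

Under these assumptions the overall factor is 7 × 10^-4 ml per reaction, so the reported DENV-1 limit of about 0.017 PFU corresponds to the 25-PFU/ml dilution and the DENV-4 limit of 0.7 PFU to the 1,000-PFU/ml dilution.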



DISCUSSION 

In this study, we report the development and evaluation of a
nucleic acid detection assay, based on RT-PCR followed by
LDR with specific primers, for the simultaneous identification
of all four serotypes of DENV in a single reaction.

We used the PCR-LDR approach to overcome the high
degree of sequence variation between DENV serotypes. The
wide distribution of DENV serotypes in the tropics and subtropics
allows for considerable genotypic heterogeneity among
the circulating serotypes and has also been linked to the virulence
of the circulating strains (5, 12, 17, 23, 30). The four
different serotypes of DENV are thought to be more dissimilar
than different "species" of Flavivirus (17). Such viral genetic
heterogeneity has implications for the design of molecular
assays. False-negative PCR results due to sequence mismatches
between DENV RNA and the primers used in real-time
PCR assays have been reported, necessitating assay revisions
(6, 16, 36). In the design of our assay, we used several
primer pairs with degeneracies to accommodate possible failures
of amplification due to sequence mismatches. The envelope
(E) and capsid (C) genes were chosen as targets, because
they were found to have sufficient variation to be useful for
serotype differentiation. However, due to the inherent heterogeneity
in the E region, a large number of primers had to be
used to account for all anticipated sequence variations. Primers
were designed for two regions in order to provide a certain
amount of redundancy that would circumvent any dropouts in
PCR amplification due to sequence variation.

FIG. 4. LOD of the PCR-LDR assay. Microarray images show the LOD for DENV-1. The number of PFU used in each PCR is given.
Fluorescent signals at least 10-fold higher than the background negative signals were deemed positive.

A multiplex PCR-LDR assay was optimized such that all
primers could be used in a single reaction for amplification of
all four DENV serotypes. This approach was found to be
>98% sensitive and specific in the detection and serotype
identification of DENV from 350 archived acute-phase serum
samples and to compare favorably to techniques described
previously (11, 24, 40). The assay has been developed as a
prototype for the multiplex detection of RNA viruses and will
be expanded for the detection of other hemorrhagic fever
viruses. Therefore, a multiplex format was adopted to define
the feasibility of the assay, and it was possible to multiplex 46
PCR primers and 75 LDR primers in a single reaction, with
significant signal intensity in the CE and microarray formats.
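
The sensitivity and specificity quoted above can be checked against the counts in Table 4. The following is a minimal sketch only: the counts are as read from the scanned table (which is noisy, so treat them as assumptions), the CDC determination is taken as the reference result, and the variable names are illustrative.

```python
# Counts as read from Table 4 (CDC determination vs. RT-PCR-LDR-universal array).
# serotype -> (samples tested, samples assigned the same serotype by PCR-LDR)
concordant = {
    "DENV-1": (20, 20),
    "DENV-2": (62, 62),
    "DENV-3": (59, 57),
    "DENV-4": (20, 20),
}
cdc_negative_total = 189      # samples negative by the CDC assay (assumed from the scan)
cdc_negative_confirmed = 186  # of those, also negative by PCR-LDR

# Sensitivity: correctly serotyped positives / all CDC-positive samples.
true_positives = sum(hit for _, hit in concordant.values())
cdc_positives = sum(total for total, _ in concordant.values())
sensitivity = true_positives / cdc_positives

# Specificity: PCR-LDR negatives / all CDC-negative samples.
specificity = cdc_negative_confirmed / cdc_negative_total

print(f"sensitivity = {sensitivity:.3f}")
print(f"specificity = {specificity:.3f}")
```

Under these assumed counts both values come out slightly above 0.98, which is consistent with the figure given in the text.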

The primers were very specific for DENV serotypes and did
not cross-react with any of the other flaviviruses tested. Only 3
of 189 samples were positive by PCR-LDR but negative by the
real-time PCR assay performed at the CDC. Upon analysis of
clinical findings, it was discovered that the serum samples from
two of these patients were collected within 1 to 5 days of the
onset of symptoms, suggesting that they may have had acute
infections. Seroconversion could not be demonstrated, because
convalescent-phase serum samples were never obtained. It is
therefore possible that these patients indeed had viremia that
could not be detected by real-time PCR. No clinical history was
available for the third patient, whose sample was retrieved from
the CDC repository archives.

Current reports suggest that several lineages of DENV-2
and DENV-4 circulate in Puerto Rico and that DENV strains
exhibit complex patterns of lineage turnover and extensions (3,
4). Thus, it is likely that several lineages of each DENV serotype
could have been circulating within the time period encompassing
the collection of specimens included in the panel
of clinical samples used for this study, which were detected by
our assay with high sensitivity and specificity.

In order to verify the ability of the PCR-LDR assay to detect
different genotypes of DENV, we used a global DENV panel.
Our assay was able to detect all but one isolate (strain ID
10674), a sylvatic strain from Africa (G.-J. J. Chang, personal
communication). The analysis of complete genome sequences
suggests that sylvatic DENV-2 isolates are evolutionarily distinct
from endemic DENV-2 isolates and supports the classification
of DENV-2 into two discrete ecotypes (4, 43). In another
study, real-time PCR primers for DENV targeted to the
C-PrM region had to be modified to detect variant sylvatic
virus (6; G.-J. J. Chang, personal communication). The sequence
of sylvatic isolate 10674 was not available when our
primers were initially designed; subsequently, two new PCR
primers were added, and it was possible to amplify this isolate.
The addition of two new primers did not adversely affect the
efficiency of the PCR, indicating the flexibility of our assay in
the identification of new genotypes. The evolutionary genetics
of DENV suggest that its population is becoming increasingly
diverse, and the presence of sylvatic reservoirs of the virus may
allow the introduction and emergence of sylvatic virus in human
populations. It has also been suggested that these conditions
could lead to the development of new pathogenic DENV
strains (17). With the PCR-LDR strategy, it would be possible
to incorporate new primers for the detection of these variant
genotypes as they evolve.

The detection limit of our assay ranged from 0.004 to 0.7
equivalent PFU/reaction. The lower LOD differed for the different
serotypes; the assay was 100 times more sensitive for
DENV-2 and DENV-1 (LOD, 0.004 and 0.017 equivalent
PFU, respectively) than for DENV-3 and DENV-4 (LOD, 0.5
and 0.7 equivalent PFU, respectively). Variable LOD have
been reported for DENV serotypes in several different studies.
Johnson et al. have reported an assay 100 times more sensitive
for DENV-2, -3, and -4 (0.0016 to 0.008 equivalent PFU) than
for DENV-1 ([illegible] equivalent PFU) and postulated that the
difference could be due to differences in the proportion of
noninfectious RNA transcripts to infectious particles (PFU)
between DENV-1 and the other serotypes (20). In a comparable
study, Lai et al. have reported variable LOD of 0.1 PFU
for DENV-1 and -2, 1 PFU for DENV-3, and 0.01 PFU for
DENV-4 (26). In yet another real-time PCR-based assay evaluated
recently, serotype-specific primers were 10 times more
sensitive for DENV-2, -3, and -4 (LOD, 0.1 PFU) than for
DENV-1 (LOD, 1 PFU) (33). The LOD observed in the
present study is comparable to those for the other techniques
reported (20, 26, 40).

The use of zip-code addresses for spotting the array and its
unique potential to recognize pathogen-specific zip-code complements
appended to the LDR primers obviate the use of
actual genetic sequence for pathogen detection. This is especially
useful for multiplexing: a large number of genomic targets
can be detected using this array, since the zip-code addresses
are synthetic oligonucleotides. Additionally, the same
array can be used for the simultaneous detection of different
organisms, since positive hybridization is dependent on the chemistry
of the synthesized zip-code oligonucleotides spotted onto the
array and their complements appended to the primers.
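
The decoupling of array addresses from target sequence can be pictured as a lookup table: the array reports which addresses lit up, and a separate assignment table says what each address means. The sketch below is illustrative only; the address names, the assignment table, and the function names are hypothetical and are not taken from the assay described here.

```python
# Hypothetical layout: each zip-code address is assigned to one target by
# appending its complement to that target's LDR primer.
ZIP_ASSIGNMENT = {
    "zip_01": "DENV-1",
    "zip_02": "DENV-2",
    "zip_03": "DENV-3",
    "zip_04": "DENV-4",
    "zip_05": "WNV",
}

def decode(positive_addresses, assignment=ZIP_ASSIGNMENT):
    """Translate positive array addresses into the targets assigned to them."""
    return sorted({assignment[a] for a in positive_addresses if a in assignment})

def reassign(assignment, address, new_target):
    """Return a copy of the layout with one address pointed at a new target.

    The printed array does not change; only the primer carrying the
    complement of `address` would change.
    """
    updated = dict(assignment)
    updated[address] = new_target
    return updated

print(decode(["zip_02", "zip_05"]))
```

Because the addresses are synthetic, redirecting an address to a new target in this picture changes only the table (that is, which primer carries the complement), not the array itself.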

The current study was undertaken to prove the feasibility of
using the PCR-LDR technique for the multiplexed detection
of RNA viruses. Since viral infections often present with
nonspecific clinical symptoms, and arboviruses have similar
geographical distributions, a multiplexed approach to the detection
of several arboviruses in a single sample could be extremely
beneficial. In addition to serotype identification, a
critical element of epidemic control measures is the early identification
of emerging genotypes and the replacement of a
genotype(s) in a given geographic region (37). Since some
DENV serotypes have more than four genotypes, PCR-LDR
may be useful in genotype detection and differentiation without
the involvement of nucleotide sequence analysis and phylogenetic
studies.

We envision extending the scope of the assay to permit the
multiplexed identification of a panel of hemorrhagic fever viruses.
Our group has already reported the use of the PCR-LDR
assay for the identification of a panel of 20 different
bacteria in blood cultures (34) and has developed an assay for
the detection of West Nile virus (38). The DENV detection
assay has been developed using similar principles, and the
technique is currently being developed for use in the simultaneous
identification of hemorrhagic fever viruses. Ultimately
the technique may prove useful in a comprehensive assay for
blood-borne infectious agents.
FIG. 1. [Beginning of caption illegible.] Each PCR primer contains
between one and three degenerate positions to accommodate minor
sequence variation in the primer binding sites. For simplicity, only
one PCR amplicon is shown. The presence of each PCR amplicon is
detected by LDR primer pairs specific for three regions within each
amplicon. The upstream LDR primers bear zip code complements, while
the downstream LDR primers [illegible] fluorescently labeled products
of different lengths that can be detected either by CE or by
hybridization to a universal DNA microarray. NTR, nontranslated region.



as well as the numerous and emerging strains of WNV, are
therefore needed (34).

In this report, we describe the development of a new, sensitive
assay for the detection of both lineages of WNV, based
on multiplex RT-PCR and the ligase detection reaction (LDR)
(Fig. 1). LDR was originally developed for discriminating
single-base mutations or polymorphisms (3, 4). The technique is
amenable to multiplexing and has been used to detect mutations
and single nucleotide polymorphisms in cancer genes
(14-16, 18, 19) and recently to detect and identify bacteria in
clinical blood samples (29).

We describe the validation of the technique for WNV detection
and identification with mosquito pool samples, clinical
isolates, and national, as well as international, strains. The high
sensitivity and broad strain coverage of the multiplex RT-PCR/
LDR assay could render it a valuable complement to WNV
serological diagnosis, especially in early symptomatic patients.
In addition, the multiplexing capacity of the technique makes
it amenable to the development of a more comprehensive
assay for viral pathogens.



MATERIALS AND METHODS

Viruses. The WNV strains used in this study (Table 1) from Uganda, France,
Israel, and New York (NY99) belonged to the European Network for Diagnostics
of Imported Viral Diseases and were provided by M. Niedrig at the Robert
Koch Institute, Berlin, Germany (28). All of the other strains were obtained from
the World Reference Center for Emerging Viruses and Arboviruses at the
University of Texas Medical Branch in Galveston (7).
TABLE 1. WNV strains used in this study
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TX 2fi7(! 
AR 277) 
I5476.(M 
IL um\ 
CT 84')5-04 
1 X 57S4 
101-21112 
LA <)24)\ 
CJRLA 1261) 



Lineage-clade(s)
(reference)



ArB.3 10/67 

lbAn70!9 

ArD-27S75 

31A 

6H8.'>ft 

EgypIlOl 

ElhAn4766 

MRMir)(Kun/in) 

K5433(Kunjin) 

IG-15578 

S049<)4 

ArD.761l)4 

ArB.1.573/fl2 

SAH-442 

SPU1I6.S9 

ArMg-979 

DaliAnMg78 

OT'>74..S 

97-103 

B 956 (WNFCG) 
Ny99 

Ml 2294 FrOO 
WNV 0043 
TVr-9497 
TVP 911.1 
TVP 10207 
TVP S833 
TVP 9241 
TVP 95.16 
TVP 100.10 
TVP 101 22 
TVP 8H.'i2 
TVP 9177 
TVP 97+4 



Central African Republic/1967
Nigeria/1965
Senegal/1979
United States/1999
India/1968
Egypt/1951
Ethiopia/1976
Australia/1960
Australia/1991
India/1957
Bangalore, India/1980
Senegal/1990
Central African Republic/1982
South Africa/1958
South Africa/1989
Madagascar/1988
Madagascar/1978
Cyprus/1968
Czech Republic/1997
Uganda/1937
United States/1999
France/2000
Israel/2000
Sonora, Mexico/2004
Tabasco, Mexico/2003
Texas/2006
Quebec, Canada/2002
Nebraska/2003
Illinois/2004
Connecticut/2004
Texas/2006
Florida/2001
California/2004
California/2003



lb 
l-b 
!-h 
l-b 
La 
l-a 
La 

KUN l-h 
KUN l b 
IND 1-c. 5 (9|' 
IND l-c, 5 (9)" 
2 

2 
2 
2 
2 



1 (I)' 



a Classified also as a new distinct [illegible] lineage (clade 5; see Discussion).
b Classified either as a new third lineage or as a novel flavivirus within the JEV group.



For sensitivity testing, a WNV load panel containing plasma samples spiked
with defined dilutions of NY99 virus was kindly provided by R. Lanciotti, from the
Centers for Disease Control and Prevention (CDC) Division of Vector-Borne
Infectious Diseases in Fort Collins, CO (21). The panel titers ranged from
180,000 to 0.18 PFU/ml (quantified by standard plaque assay). Other flaviviruses
used to determine the specificity of the assay were obtained from various sources:
the St. Louis encephalitis, Murray Valley fever, Powassan encephalitis, and
yellow fever viruses were obtained from R. Lanciotti (CDC, Fort Collins); JEV
and tick-borne encephalitis virus were obtained from M. Niedrig (Robert Koch
Institute, Berlin, Germany); and dengue virus serotypes I to IV were obtained
from J. Muñoz (CDC Dengue Branch in San Juan, Puerto Rico) (Table 2).

Mosquito pools and clinical samples. Ninety-eight positive and 20 negative
mosquito pool samples were obtained from the New York City Department of
Health and Mental Hygiene (NYC DOHMH), which collects and tests trapped
mosquitoes from different areas in New York City as part of its WNV
surveillance program ([citation illegible]).

Plasma samples from [illegible] NAT-positive blood donors and [illegible] NAT-negative
donors were tested. The positive plasma samples were provided by the Gulf
Coast Regional Blood Center in Houston, TX; the negative samples were obtained
from the CDC Dengue Branch in Puerto Rico. Twenty additional plasma
samples which tested positive for dengue virus and negative for WNV were also
provided by the CDC in Puerto Rico. Additional samples for initial primer
validation were obtained from S. Stramer at the American Red Cross National
Testing and Reference Laboratories, Gaithersburg, MD.

RNA extraction. Samples from mosquito pool homogenates were clarified by
centrifugation at 2,000 × g for 3 min. Viral RNA was extracted from the
mosquito pool sample supernatants and clinical samples, as well as from viral
seeds, with the QIAamp viral RNA mini kit (Qiagen, Valencia, CA) according to



TABLE 2. Other flaviviruses used in this study

Virus                                   Strain        Origin/yr                  Titer
Dengue virus serotypes I, II, III, IV   ND(a)         Puerto Rico/2004-2006      ND
St. Louis encephalitis virus            MSI-7         Mississippi/1977           [illegible] PFU/ml
Murray Valley fever virus               OR2           Victoria, Australia/1951   [illegible]
Powassan encephalitis virus             [illegible]   Ontario, Canada/1965       ND
Yellow fever virus                      [illegible]   Ghana/1927                 5 × 10^[illegible] PFU/ml
JEV                                     SA14-14-2     China/1954                 2 × 10^[illegible] copies/ml
Tick-borne encephalitis virus           K23           Reference 27               1 × 10^[illegible] copies/ml

a ND, not determined.
TABLE 3. Primers used for PCR and RT-PCR

Region   Forward primer sequence(a) (5'-3')   Reverse primer sequence(a) (5'-3')   Tm(b) (°C)   Amplicon length



I (NS2a) WNV PCR F5, GCCAACTACCGCAACAC A 
GVtGGGCCTTMTGGTCGTGTTC 

WNV_PCR n. GCCAACTACCGCAACAC T 
GGCCACtCAGGAOGTCCTTC 



2(N.S.'>) WNV PCR F9A, GCCAACTACCQC/\ ACAr 

OATGTG'OAAGAGOCGOYTGGTGTTA 
WNV_FCR F9B, GC CAACTACCOCAACAf 

GGTGTGGTAGAOOCCGCTOGTGCTA 
WNV.PCR_FIOA. acCAA,aACOGgAA£AC 

AGAAGAGGGTKCAGGAAGTGAAAGG 

GTACAC 

WNV^PCR FIQB. G.CCAACTA CCGCAACAC 
AAAAAAOAGTYCAAOAAGTCAGAfiGG 
TACAC 

^ (NS5) WNV_PCR FUA, GCCAACTACCGCAACAC 

gcoaacc:acccctacaggacctocaac 

WNV. PCR FI3B, G CCAACTACCGCAACAC 
GAGAACCACCCATATAGAACCTGGAAC 

WNV_PCR_FMA, GCCAACTACCGCAACAC 
CAATGTCACCACGATGOCCATGAC 

WNV PCR FI4B, G CCAACTACCp CAAf AC 
GAATGfYACCACMATCCCCATCAC 



SI-8J W NV_P CR R7A. CCaACTACCGCAACCa 
CTTTGATGAGGCTTCCAACTCCAA 
CCAT 

8.1 WNV.PCR.R7B, CCAACTACCGCAACCA 81-S3 
CCCTGATCAARCTGCCTATTCCRA 
CCAT 

WNV_PCR,R8A, CCAACT ACCGCAACCA 79 

ATGAGOCAACCTCCTTTCrnTTTGC 
WNV_PCR RSB, CCAAC TACCGCAACCA 7S-S1 

A R CAG ACTK G CTCCTTTCTTYTTTGC 

82-83 WNV. PCR R 1 1 A. CCAACTACCGCAAqC Si 

iCCCAGAACCACCTGGCTWGTCAT 
85 WNV PCR RUB. CCAACrACCGCAACC 8M2 

ACCTARGAGYACCTGGCTGGTCAT 
8.V84 WNV.PCR_R12A. CXAACIACtSCAACC "'-82 

AGGTCCCTrCCAKGTYTTCTTYT ' 

CCAT 

SI WNV PCR RI2B. CCAACTACCOCAArr 8)-S1 
aOOTCCCTTCCAGGTCCTYTTYT 
CCAT 

SJ WNV PCK RISA, CCAACTACCGCAACr 7«-HI) 42(>-'iWl 

aGG VrnTTCTC YCTCITTCCCATC 
82 WNV PCR RI.SB. CCAACTACCGCAAf C 79-80 

AGOmCTTYTCTCTCTrSCCCATC 
82 WNV PCR RI6A, CCAACTACCGCAACC 81-83 

ACAGA/\A0CCAGCTCCKAGCCAC 
8CM2 WNV PCR RI6B. CCAACTA CCPC AACC 83-85 

ACAGOAAGCGRGCYCCCAGCCAC 



a The universal tail sequences appended to the 5' ends of all primers are underlined.
b Tm, melting temperature.



the manufacturer's instructions. RNA was eluted in 60 µl and stored at -80°C
until used. One negative and one positive extraction control were processed
along with each group of [illegible] samples subjected to RNA extraction.

Selection of target regions for multiplex PCR/LDR. Thirty-nine WNV complete
genomic sequences available from the GenBank database (accessed in
January 20[illegible]) were aligned by using the ClustalW algorithm to select optimal
primer binding regions. These regions were characterized by a higher degree of
conservation among different WNV strains so as to achieve maximum strain
coverage. Primer sets with partially overlapping sequences were designed to
simultaneously amplify three different regions (Integrated DNA Technology,
Coralville, IA). One region was in the coding sequence of nonstructural protein
NS2a, and two regions were in nonstructural protein NS5 (respectively, 6, 8, and
8 PCR primers were used, for a total of 22 primers). Each of the three amplicons
was ~[illegible] bp. Each primer sequence contained no more than two degenerate
positions and had a melting temperature of around 80°C (Table 3).
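
To make "degenerate positions" concrete, the following minimal sketch counts the degenerate positions in a primer and expands it into the exact sequences it represents, using the standard IUPAC nucleotide codes. The example primer is hypothetical and is not one of the primers in Table 3.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degenerate_positions(primer):
    """Return the 0-based indices of positions that encode more than one base."""
    return [i for i, base in enumerate(primer.upper()) if len(IUPAC[base]) > 1]

def expand(primer):
    """List every non-degenerate sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]

example = "GATYACCRTG"  # hypothetical primer with two degenerate positions
print(degenerate_positions(example))
print(expand(example))
```

A primer with two two-fold degenerate positions, as in this example, is in effect a mixture of four oligonucleotides, which is why limiting the number of degenerate positions keeps the effective concentration of each variant reasonable.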

A major barrier to high-sensitivity multiplexed PCR amplification in other
systems has been the exponential increase in potential false amplicons and
primer dimers that results from using too many PCR primers in the same
reaction mixture. We circumvent this potential pitfall by using PCR primers
containing universal tail sequences on the forward (5'-GCCAACTACCGCAACAC-3')
and reverse (5'-CCAACTACCGCAACCA-3') primers (Table 3).
Primer dimers and short aberrant amplicons do not amplify efficiently because
such products form panhandle structures. The correct PCR product is just the
right size for efficient amplification ([size range illegible] bp), while false longer
amplicons do not amplify as well. We have successfully applied this strategy where
standard multiplexed PCR has failed (12, 18, 19).
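
The panhandle effect relies on the two universal tails sharing a common core, so that each strand of a tailed product carries that core at its 5' end and the reverse complement of the core near its 3' end. The sketch below checks this for the two tail sequences as read from the scanned text and Table 3 (treat the exact sequences as assumptions); the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_common_substring(a, b):
    """Return the longest stretch of `a` that also occurs in `b`."""
    best = ""
    for i in range(len(a)):
        # Only try candidates longer than the current best.
        for j in range(i + len(best) + 1, len(a) + 1):
            if a[i:j] in b:
                best = a[i:j]
            else:
                break
    return best

FORWARD_TAIL = "GCCAACTACCGCAACAC"  # as read from the scan
REVERSE_TAIL = "CCAACTACCGCAACCA"   # as read from the scan

shared = longest_common_substring(FORWARD_TAIL, REVERSE_TAIL)

# One strand of any tailed product begins with the forward tail and ends
# with the reverse complement of the reverse tail.
three_prime_end = reverse_complement(REVERSE_TAIL)

print(shared, len(shared))
print(reverse_complement(shared) in three_prime_end)
```

With these sequences the shared core is 14 nucleotides long, so the two ends of a single strand can pair with each other. In very short products the ends are close together, which is the usual rationale for why primer dimers are outcompeted by intramolecular panhandle formation.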

LDR primers were chosen in three different conserved regions within each of
the three PCR amplicons and, just as for the PCR primers, designed with the
intent of achieving the highest strain coverage. Each LDR primer pair was
composed of an upstream probe bearing a 20-mer zip code complement sequence
at its 5' end and a downstream probe bearing a 6-carboxyfluorescein
fluorophore at its 3' end. The zip code complements are unique 20-mer
oligonucleotide sequences complementary to the zip code addresses spotted on the
universal DNA microarray (18).

During LDR, both upstream and downstream probes hybridize to the template
sequence, and ligation occurs only when there is perfect complementarity at the
ligation junction (Fig. 1). Forty-nine LDR primers targeting a total of nine
regions (three LDR primer pairs per PCR product) were designed, with the aim
of detecting as many strains as possible and potential new variants (Table 4).
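
The junction rule can be illustrated with a toy model. This is a deliberately simplified sketch: sequences are hypothetical, the target is written in the same sense as the probes (that is, as the sequence the probes would copy), hybridization away from the junction is assumed, and only the two bases flanking the nick are checked.

```python
def ligates(upstream, downstream, target, junction):
    """Toy model of the LDR junction rule.

    `target` is written in the probes' sense; `junction` is the index in
    `target` where the downstream probe starts. Ligation is allowed only if
    the 3' base of the upstream probe and the 5' base of the downstream
    probe both match the target at the nick.
    """
    if junction < len(upstream) or junction + len(downstream) > len(target):
        return False  # probes would hang off the end of the target
    up_site = target[junction - len(upstream):junction]
    down_site = target[junction:junction + len(downstream)]
    return upstream[-1] == up_site[-1] and downstream[0] == down_site[0]

# Hypothetical sequences for illustration only.
wild_type = "ACGTTGCAAGGCTA"
variant   = "ACGTTGTAAGGCTA"  # single-base difference just 5' of the nick
up_probe, down_probe = "ACGTTGC", "AAGGCTA"

print(ligates(up_probe, down_probe, wild_type, 7))
print(ligates(up_probe, down_probe, variant, 7))
```

In this model the probe pair is joined on the matching target and left unjoined on the variant, which is the behavior that lets a set of allele-specific upstream probes discriminate closely related sequences.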

LDR products ranged from 72 to 88 bp in length. The oligonucleotide probes
(Integrated DNA Technology, Coralville, IA) were designed to have similar
thermodynamic features and to avoid hairpin and self-dimer formation.

Two-step multiplex RT-PCR. RT reactions were performed in a 60-µl volume
with the SuperScript First-Strand Synthesis System for RT-PCR (Invitrogen,
Carlsbad, CA) and random hexamers. Briefly, 20 µl of RNA extracted from
mosquito pools or from clinical samples was mixed with 6 µl of 50 ng/ml random
hexamers, 1 µl of a 10 mM deoxynucleoside triphosphate (dNTP) mixture, and
1 µl of water and denatured at 65°C for 5 min. After cooling on ice, 1× RT buffer
(20 mM Tris-HCl, 50 mM KCl), 5 mM MgCl2, 0.01 M dithiothreitol, and
[illegible] U of RNaseOUT were added for a 2-min incubation at 25°C. Six units of
SuperScript II RT was added, and the mixture was incubated first at 25°C for
10 min and then at 42°C for 50 min. The reaction was terminated by heating at
70°C for 15 min. Degradation of residual RNA or cleavage of RNA-DNA hybrids
was achieved by incubation with 6 U of RNase H for 20 min at 37°C.

Five microliters of newly synthesized cDNA was subjected to multiplex PCR
amplification in a final volume of 25 µl which contained 1× GeneAmp PCR
Gold buffer, 1.5 mM MgCl2, 200 µM each dNTP, 5 pmol of each PCR primer,
and 1 U of Taq polymerase (AmpliTaq Gold; Applied Biosystems, Foster City,
CA). After 10 min of incubation at [temperature illegible] for hot-start Taq
activation, a total of 40 cycles were performed, each consisting of a denaturation
step at 94°C for 10 s, an annealing step at 60°C for 1 min, and an extension step
at 72°C for 1 min, with a final extension step at 72°C for 10 min. A nontemplate
negative control and a WNV cDNA positive control were included in each round.
All PCR thermal cycling was performed in a Perkin-Elmer GeneAmp PCR System
9700 thermal



TABLE 4. Primers used for LDR

Downstream primer, sequence (5'-3')

Allele-specific (upstream) primer, sequence (5'-3')



I (NS2a) WNV83UrQM, d'hiiijCTATC ATGCTTGCA 
CI tXTACTCCTAGTCilTrCJG 
(b-FAMl 

WNV83lhCOM, fPhosjCTATACTGATTGCT 
CTGCTAGTYCTGGTGTrrGG|6-FAM) 

WNVHJlcCOM, (Pho>)CCATACTGATTOCC 
CTGCTAGTTCTACTGTTT0G(6-FAM) 

WNV93yaCOM, (Phos)GAaACGTCG,TGCA 
CrrGCCACrrATGG(6-FAM) 

WN VVMbCOM , ( PhiwJG AG ACGTGGTACA 
CTTGGC:GCTCATGGi6-FAM) 

WNV9.WCCOM. (Phos)GAGATOTGOTOCA 
TCTGGCGCTCArGG(6-FAMj 

WNVIU:iaCOM, (PhTO)GACCAACCAAGA 
OAGTArrTTGCTCATGCTrG((>-FAM} 

WNVI02lhCOM. (Ph(is)GACCAACCAGGA 
GAACATVrrG7TGATOTTGG(6-FAM) 

WNV 1 02 1 cCOM, ( Phos)GACCAACCAAG A 
AAACArrCTGCTGATGTTGn(6-FAM) 

2 (NS.M WNV534UCOM, (Phcis)GTTGOAATATTGT 
TACCATC.^GAGTGGAGTCGACGTC 
16-FAM) 

WNV5.14lbCOM. (PboijCATGGAACATTG 
r G A CCA TOAAGAGYGGRGTGGATGT 
G(6-FAM) 



WNV5427aCOM, (Ptios)AGTCATCGTCAAG 
TGCCOaGG TAGAAGAACACCGC{6. 
FAM) 

WNV54271)C"OM, (Phos)AGTCYTCRTCAAG 
TGCTGAGGTTCAAGAGCATAGG(f)- 
FAM) 

WNV5427CCOM, (Pho5)AGTCCTCATCAAG 
TGCTGAAGTTGAAGAACATAGA(6- 
FAM) 

WNV5548aCOM, (Ph.>s)CCAAAGTGATTGA 
GAAGATGGAAACACTCC(6-PAM) 

WNVS54KbCOM. (Phos)CRAAAGTCATAG 
AGAAGATOGAGCTRCTCC 
(h-FAM) 

WNVJMScCOM. (Phos)CAAA,GGTCATAG 
AAAAGATGGAGCTGCTCC(fi-FAM) 

.1 (NSS) WNV<i(W4aC;OM. (Phus)GAGTCAAATACGT 
CCTCAATGAGACCACGaaCTGGCTG 
(b-FAM) 

WNV6094hCOM. (Phos)OAGTGAAGTATG 
TGC^CAAYGAAACCACCAAYTGGTTG 
(6-FAM) 

WNV(>f(44cCOM, (Phos)GAGTGAAGTACG 
TGa'CAAYGAGACCACCAAYTGOTTG 
(6-FAM) 

WNVhOV4dCOM, lPh<)s)GAGTTAAGTATGT 
CKTCAATGAAACCACCAATTGOTTG 
(fi-FAM) 



LDR product
length (bp)



74 

73-74 
74 
71 



WNV8JIG2p52, (NHj)TCCGTrCCGCCA 
GAGCGTGAGTGGACRGCCAAGATC 
AGCATKCCAG 



80 



73 



77-79 



74- 76 



69 



69-72 



WNV939aOZ,p53, |NH,)TCCCCGTCCGCT «4 
GTCTrTCCCGTTTGCTGAAGCRAAC 
TCAGGAG 

WNV939bOZp53, (NHj)TCGCCGTCCGCT H2-.S4 
CTCTTTCCCTTTCGCAOAATCYAAT 
TCRGCAG 

73 WNV939cOZ.pM, (NH,)TCGCCGTCCGCT HV-84 
GTCnrGGCCTTTGCAGAATCMAAC 
TCAGGAG 

71 WNVI021aGZp54, (NH,)GACCAGGCCTC 87 
GACCCACCCGGTGGCTTCCTrrrTG 
AAOGCAAOGTG 
71-73 WNVI02lbGZp54, (Nf l,)GACGAGGCCTC 87 
UAtCCACCCGGTGGCATCGTTTCTC 
AAAGCGAGATC 
71 WNV)021cGZp54. (NH,)CACCACGCCTC 86 
CACCCACCCGGTAGCATCATTTCTC 
AAGCCGAOATG 

76 WNV.SMIiiOZpSS, (NH,)GCCACGCTGCC 86 
AGGACTCCGATGAAGAACCACAACT 

ggtgc:agagctatg 

WNV534lbGZp55. (NHj){;CCAGGCTCCC 85-86 
AGGAtTCCGATGAAGAGCCCCAACT 
RGTGCAAAGTTATO 
\VNVS34lcGZpJ5. fNH,)GCCACCCrGCC 88 
AGGAtTCCGATGAAGAGCCCCAGCT 
AGTOCAGAGCTATO 
WNV5427»GZp56. (NHOTCCGGTCITGG 88 
TCGCTTCCCGCGAGCGACACACTGC 
TCTGTGACATTOGAG 
WNV5427bGZp56, (NH,)TCCCGTCrrGG 8^-88 
TCGCTTCGCTO YTGCG A Y ACYCTCC 
nTGTGACATCGGAG 
WNV5427cGZpi6, (NH,)TCCGGTtTTGG 85 
TCGCTTCGCTOTTCTGACACTCTCCT 
rrOTGATATCGGAG 
WNV5548aCZp57, (NH.)GTCTTCGCCGT 86 
GGGTCCCTCrrOCATCAAAGTCCTA 
TCGCCTTACATGC 
WNVS.HabCZpS?, (NH,)GTCTrCGCGGT 86-88 
GGCTGCCTGTTOYGTGAAGGTRCTS 
TGCCCCTACATOC 



WNV6094aGZp58, (NIFjJCCCTITCGnG 86 

GC-TGCGGACCACAAAGOCTCCAGA 

GCCrCCAGAAG 
WNV6(»4bGZp58, (NIDGCCTTTCGTTG 86-87 

GCTCCGCJACCACGAAAOCTCCBGAA 

CCGCCAGAAG 



75-77 



Continued on following page

TABLE 4-Continued

Region

Downstream primer, sequence (5'-3')

Allele-specific (upstream) primer, sequence (5'-3')

LDR product length (bp)



WNVfildSaCOM, (Phos)TG YTCCCGGOAG 
GAATn ATYGOAAAAGTCAACAG(fi- 
FAM) 

WNV6l[,8hCOM. (Phtis)TOCTCGCGAGAO 
0AATrrATAAGGAAGCTCAATAG(6- 
FAM) 

WNVfiifiikCOM, (PhoDTGCrCTCGAGAGO 
AKTrCATAAGAAAGCTCAACAGfh- 
FAM) 

WNV6244aCOM, (PhmjCGAAGAACGCCC 
GGCAAGCT0TAGAGGATCC(6-FAM) 

WNV6244bCOM, (Phos)GGAGGAGTOCTA 
GAGAAGC0GTT0AAGATCC(6-PAM) 

WNV6244<;COM, (Ph(M)CGAGGAOCGCCA 
GAGAAGCAGTT0AA0ATCC(6-FAM) 



73-76 



7,1. 74 



WNV6l68aG2p59, (NHjjCCACCTCGGCC R6-B8 

ACGCTCTOCTrACCCCOCGAYAARA 

AACCCAGGATG 
WNV()l68bGZp.'i9, (Nl I,)CCACCTCCGCC 88 

ACCCTtTGCCTOGCACGAGAAAAG 

COTCCCAOAATG 
WNV616acGZp.«, (NII,)CGACtTCUCCC HI 

ACGCTCTGtCTGGCCAGAGAAAAAC 

GTCCCAGAATG 
WNV6Z44a'r2p60, (NH,)GCCACTCGTCC 85 

CTCCCCCACAGGAGCOATGTTTGAA 

GAACAGAACCAAT 
WNV6244bTZp(>0, (NH;)Gt'CACTCGTCC «5-Sti 

CTCCGCCACGGGTGCCATOTTTGAA 

GARCAGAAYCAAT 



a The zip code complements on the upstream primers are in bold.
b Tm, melting temperature.



cycler (Applied Biosystems, Foster City, CA). The amplification products were
visualized by electrophoresis in a [illegible]% agarose gel.

One-step multiplex RT-PCR. Alternatively, the RNA extracted from clinical
samples was subjected to a one-step multiplex RT-PCR (OneStep RT-PCR kit;
Qiagen, Valencia, CA). Briefly, 15 µl of RNA was mixed in a 50-µl final volume
containing 1× OneStep RT-PCR buffer, 0.4 mM dNTPs, 0.6 µM each PCR
primer, and 2 µl of OneStep RT-PCR Enzyme Mix. RT (at 50°C for 30 min) was
followed by a denaturation step at [illegible] for 15 min and 45 cycles of amplification
([illegible] for 10 s, 60°C for 1 min, and 72°C for 1 min) with a final extension step at
72°C for 10 min.

Multiplex LDR. LDRs were carried out in a final volume of 20 µl containing
5 µl of amplified DNA, 1× LDR buffer (20 mM Tris [pH 7.6], 10 mM MgCl2, 100
mM KCl), 1 mM NAD, 1 mM dithiothreitol, [illegible] fmol of each LDR primer, and
0.01 µM AK16D DNA ligase (4, 25).

LDR mixtures were thermally cycled for 20 cycles of [illegible] and 4 min at
[illegible]. Prior to LDR, a mixture containing 7.5 pmol of each LDR primer was
phosphorylated in a [illegible]-µl kinase reaction mixture containing 1× T4 ligase buffer
(50 mM Tris-HCl, 10 mM MgCl2, 10 mM dithiothreitol, 1 mM ATP, 25 µg/ml
bovine serum albumin) and 10 U of T4 kinase (New England BioLabs, Ipswich,
MA). The mixture was incubated at 37°C for 60 min, followed by 10 min of
incubation at 65°C and storage at 4°C.

Capillary electrophoresis (CE). After the LDR, the mixture was diluted 1:10
in water, and 1 µl was added to [illegible] µl of a CE master mixture containing Hi-Di
formamide (Applied Biosystems, Foster City, CA) and 0.1 µl of GeneScan 500
LIZ size standard (Applied Biosystems, Foster City, CA). The CE mixture was
denatured at 94°C for 2 min and chilled on ice. The LDR products were analyzed
on a 3730 DNA analyzer (Applied Biosystems, Foster City, CA). A sample was
considered WNV positive when a minimum of two LDR products in any of the
three PCR amplicons was detected by CE.

Universal DNA microarray spotting and hybridization conditions. Universal
microarrays were prepared by spotting unique 20-mer zip code oligonucleotides
(12, 14, 15, [illegible]) on activated CodeLink slides (GE Healthcare, Piscataway, NJ)
with a QArrayMini robotic array printer (Genetix, Boston, MA) according to the
manufacturer's instructions. The zip code addresses were plated into a 384-well
microplate in 50 mM sodium phosphate (pH 8.5) at a final concentration of
[illegible]. A 1 µM concentration of a fiducial oligonucleotide was added to the
printing mixture in each well and cospotted with each zip code address. A
carboxy-X-rhodamine-labeled fiducial complement was included in the hybridization
mixture to determine the position and quality of each spot. Robotic
printing was carried out at [illegible]°C and 50 to 60% humidity. Printed slides were
incubated in a saturated NaCl chamber overnight and then treated with a blocking
solution (0.1 M Tris, 50 mM ethanolamine [pH 9.0]) to block residual
reactive carboxyl groups. The slides were washed with 4× SSC (20× SSC is 3 M
sodium chloride and 0.3 M sodium citrate, pH 7.0)-0.1% sodium dodecyl sulfate
(SDS) and spun dry. Each printing layout contained a total of [illegible] zip code
addresses spotted in duplicate (see Fig. 1D). Nine zip code addresses were
designed to hybridize the zip code complements appended to the WNV-specific
upstream LDR primers. The other zip codes on the array were complementary
to zip code complements on LDR primers specific for other blood-borne viral
pathogens (dengue virus and other hemorrhagic fever viruses). LDR products
from a subset of nine mosquito pool samples and from positive and negative
controls were denatured at [illegible]°C for 3 min and chilled on ice prior to
hybridization to the arrays. The hybridization solution consisted of the entire
volume of the LDR products, 5× SSC, 0.1% SDS, 0.1 mg/ml salmon sperm DNA
(Fisher Scientific), and a 5 nM concentration of the fiducial complement in a total
volume of 30 µl. The hybridization solution was applied to the microarray slide
with a multichamber ProPlate Slide Module (Grace Bio-Labs, Bend, OR). The
hybridizations were carried out in the dark on a rocking platform within a
hybridization oven (Lab-Line; VWR, West Chester, PA) at [illegible]°C for 2 h. The
slides were rinsed with 5× SSC and washed with 1× SSC-0.1% SDS at [illegible] for
15 min. Two more washes followed, first with 0.2× SSC at 23°C for 1 min and then
with 0.1× SSC at 22°C for 1 min. The slides were spun dry and scanned on a
ProScanArray microarray scanner (Perkin-Elmer, Wellesley, MA).

LOD. The limit of detection (LOD) of the assay was measured with a WNV
load panel containing plasma samples spiked with defined dilutions of NY99
virus. The panel titers ranged from 180,000 to 0.18 PFU/ml (quantified by
standard plaque assay). Dilutions ranging from 1,800 PFU/ml to 0.18 PFU/ml
were tested. For each dilution, RNA was extracted from a 140-µl aliquot with the
QIAamp viral RNA mini kit (Qiagen, Valencia, CA) as described above. The
RNA was subjected to either the one-step or the two-step multiplex RT-PCR as
described above, followed by multiplexed LDR and CE as described earlier.
Thus, a dilution of 1,800 PFU/ml corresponds to 63 PFU per reaction for the
one-step method (a 140-µl aliquot was extracted into a final volume of 60 µl of
RNA, of which 15 µl was used for the RT-PCR) and to 2.8 PFU per reaction for the
two-step method (a 140-µl aliquot was extracted into a final volume of 60 µl of
RNA, of which 20 µl was used in the RT step in a total volume of 60 µl; of this
60-µl volume of cDNA, 2 µl was used in the PCR step).
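
The conversion from panel titer to PFU per reaction is a chain of volume fractions, which the following minimal sketch reproduces. The function name and the representation of each step as a (volume used, total volume) pair are illustrative choices; the volumes are those stated in the paragraph above.

```python
def pfu_per_reaction(titer_pfu_per_ml, steps, sample_ul=140.0):
    """Scale the PFU in the extracted aliquot through each dilution step.

    `steps` is a list of (volume_used_ul, total_volume_ul) pairs, one per
    step in which only part of the material is carried forward.
    """
    pfu = titer_pfu_per_ml * sample_ul / 1000.0  # PFU in the extracted aliquot
    for used_ul, total_ul in steps:
        pfu *= used_ul / total_ul
    return pfu

# One-step method: 15 of 60 µl of RNA goes into the RT-PCR.
one_step = pfu_per_reaction(1800, [(15, 60)])

# Two-step method: 20 of 60 µl of RNA into a 60-µl RT, then 2 µl of cDNA into the PCR.
two_step = pfu_per_reaction(1800, [(20, 60), (2, 60)])

print(round(one_step, 1), round(two_step, 1))
```

Applied to the other panel dilutions, the same function gives the input per reaction at each titer, so the lowest dilution that is still detected translates directly into the LOD expressed in PFU per reaction.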



RESULTS 

PCR/LDR/CE primer selection and validation. A multiplex
PCR/LDR/CE assay was developed for detecting WNV based
on the simultaneous screening of three WNV genomic regions
(Fig. 1). The three target regions were initially tested separately
by performing PCR/LDR/CE assays with primers specific
for each region (Fig. 2A to C). In this way, it was possible
to evaluate the primer performance for each region and to
ensure signal detection from each of the expected total of nine
LDR products. In the initial evaluation phase, performed with
FIG. 2. Representative CE profiles of ligation products from the PCR/LDR. (A to C) CE profiles from a uniplex PCR/LDR for each of the
[illegible]; the peaks represent LDR products from regions WNV5548, WNV5427, and WNV5341, respectively. (D) [illegible]. Note that there is
a single peak at 80 bases and that the peak at 86 bases merges with the peak at 88 bases, giving a small peak at around [illegible] bases. Fluorescence
intensity is indicated on the y axis, and the number of bases is indicated on the x axis.



WNV culiures, as well as WNV-posilive plasma samples ob- 
tained from the American Red Cross, ihc primers which failed 
to produce cither the PGR amplicon or one of the LDR prod- 
ucts were discarded and replaced with newly designed primers 
(data not shown), 

Region-specific LDR products for each region produced
peaks at 73, 7[illegible], and 80 bases for region 1 (along with a minor
peak at 81 bases arising from ligation of a primer with a single
base difference at one of the ligation sites); 79, 86, and 88 bases
for region 2; and 80, 82, and 84 bases for region 3 (Fig. 2A to
C). Figure 2D shows the CE profile obtained when the LDR
was multiplexed for all three WNV regions. LDR products
with identical lengths but different sequences (due to the use of
degenerate oligonucleotides) can migrate at separate positions
on CE, resulting in broadened peaks that may overlap nearby
peaks. The algorithm for the identification of a positive sample
requires detecting the presence of at least two LDR products
from any one amplicon or one LDR product from any two
amplicons, i.e., at least two separate peaks. No specific peak
need be present. The cumulative profile from multiplexed
LDR was sufficient to positively identify a sample.
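
The positive-call rule just described reduces to counting distinct LDR products above the signal threshold. The following is a minimal sketch under stated assumptions: the input format (peaks already assigned to an amplicon and a product label) and all names are hypothetical, and the 200-fluorescence-unit cutoff is taken from the Fig. 3 legend.

```python
THRESHOLD_FU = 200.0  # signals below this value are treated as negative (Fig. 3 legend)


def is_positive(peaks, threshold=THRESHOLD_FU):
    """Return True when at least two distinct LDR products are detected.

    `peaks` is an iterable of (amplicon, product, signal) tuples; this format
    is an assumption, and peak-to-product assignment is done upstream.
    Two products from one amplicon, or one product from each of two amplicons,
    both satisfy the rule, so only the number of distinct products matters.
    """
    detected = {(amplicon, product)
                for amplicon, product, signal in peaks
                if signal >= threshold}
    return len(detected) >= 2


# Two products from the same amplicon: positive.
print(is_positive([(1, "a", 500.0), (1, "b", 320.0)]))   # True
# One product above threshold, one below: negative.
print(is_positive([(1, "a", 500.0), (2, "b", 150.0)]))   # False
```

Because no specific peak is required, the check stays valid if individual primers fail on a variant strain, as long as two products remain detectable.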

LOD. To evaluate the LOD of the WNV multiplex PCR/
LDR/CE assay, we tested a viral load panel containing plasma
samples spiked with serial dilutions of WNV. The LOD was
determined after performing RNA extraction, RT, and
multiplex PCR/LDR/CE, and it was calculated both for the assay
that used the two-step approach and for the one-step multiplex
RT-PCR.

FIG. 3. LOD of the multiplex PCR/LDR/CE assay. CE profiles were
obtained after RNA extraction and one-step multiplex RT-PCR/LDR of
the WNV load panel. CE signals below a threshold of 200 fluorescence
units were considered negative. Fluorescence intensity is indicated
on the y axis, and the number of bases is indicated on the x axis.

In the two-step approach, the RT was performed with random
hexamers, followed by PCR amplification with region-specific
primers. This approach allowed the detection of the
sample dilutions from the viral load panel containing 11 PFU/ml.
This corresponds to a LOD of 0.017 PFU. Previous studies
using preparations of the NY99 virus strain grown in Vero cells
have consistently shown that 1 PFU represents 500 copies (30;
R. Lanciotti, personal communication). Therefore, based upon
these calculations, PCR/LDR has an LOD of approximately
eight genome copies.

The one-step multiplex RT-PCR approach allowed the
detection of the viral load panel dilutions containing 0.15 PFU/ml,
about 70 times less concentrated than that detected by the
two-step method. The LOD with this method corresponds to
0.005 PFU or 2.5 genome copies (Fig. 3). This compares
favorably to other detection systems, including those that detect
both lineages 1 and 2, such as the FDA-licensed PROCLEIX
System (21, 23, 24).
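
The conversion from PFU to genome copies, and the fold difference between the two protocols, can be reproduced as follows. This is an illustrative sketch only; the constant and function names are assumptions, and the 500 copies/PFU figure is the one cited in the text for NY99 grown in Vero cells.

```python
COPIES_PER_PFU = 500  # value cited in the text for the NY99 strain


def genome_copies(lod_pfu, copies_per_pfu=COPIES_PER_PFU):
    """Convert an LOD expressed in PFU per reaction to genome copies."""
    return lod_pfu * copies_per_pfu


two_step_copies = genome_copies(0.017)  # reported as approximately eight copies
one_step_copies = genome_copies(0.005)  # reported as 2.5 copies
fold_difference = 11 / 0.15             # ratio of lowest detected titers (PFU/ml)

print(round(two_step_copies, 1))  # 8.5
print(round(one_step_copies, 1))  # 2.5
print(round(fold_difference))     # 73
```

The computed ratio of about 73 matches the "about 70 times" stated above.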

Sensitivity and specificity of multiplex PCR/LDR/CE. The
sensitivity of the multiplex PCR/LDR/CE system was
determined with WNV cultures and environmental and clinical
samples. A sample was considered WNV positive when a minimum
of two LDR products in any of the three PCR amplicons was
detected. WNV cultures included 34 strains from 1[illegible] countries
(Table 1) which belonged to both lineages 1 and 2, as well as
the Kunjin and Rabensburg viruses. All of the strains tested
positive, except for the Rabensburg virus and two Indian
isolates. Both the Rabensburg virus and the two Indian isolates
nevertheless produced positive PCR amplification products
visible after gel electrophoresis, indicating that the LDR did
not work.

To evaluate the specificity of the method, seven other
flaviviruses were tested, as listed in Table 2. Although four
of them (St. Louis encephalitis virus, yellow fever virus,
Murray Valley fever virus, and JEV) produced PCR
amplification products detected by agarose gel electrophoresis, no
ligation products were obtained after the LDR. Ninety-eight
pooled mosquito homogenates which had previously tested
positive according to the NYC DOHMH were subjected to
the two-step multiplex PCR/LDR/CE assay. All but one
sample produced a positive signal, for a sensitivity of about
99%. Twenty WNV-negative mosquito pools were also
tested, and no false positives were found.

Fifty WNV-positive plasma samples with a representative
range of concentrations (a minimum of 100 copies/ml) were
obtained from the Gulf Coast Regional Blood Center in Texas.
These samples were subjected to RNA extraction and were
tested in parallel by both the two-step and one-step multiplex
PCR/LDR/CE methods. While the one-step approach
detected WNV RNA in all 50 samples with 100% sensitivity, the
two-step protocol displayed 82% sensitivity (41 out of 50
positive samples detected). Ninety-two additional WNV-negative
plasma samples, together with another 20 dengue
virus-positive but WNV-negative samples (obtained from CDC, Puerto
Rico), were included in the analysis. No false positives were
detected from any of the total of 112 WNV-negative plasma
samples, providing 100% specificity.
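
The sensitivity and specificity figures reported in this section follow directly from the sample counts. A minimal sketch, with helper names chosen for illustration:

```python
def sensitivity(true_pos, false_neg):
    """Fraction of known-positive samples that the assay called positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of known-negative samples that the assay called negative."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the text above.
print(f"{sensitivity(97, 1):.1%}")   # mosquito pools, two-step: 99.0%
print(f"{sensitivity(50, 0):.1%}")   # plasma, one-step: 100.0%
print(f"{sensitivity(41, 9):.1%}")   # plasma, two-step: 82.0%
print(f"{specificity(112, 0):.1%}")  # WNV-negative plasma: 100.0%
```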

Universal DNA microarray. A subset of nine WNV-positive
mosquito pool samples was tested with the universal DNA
microarray as an alternative readout system. Successful
ligation of the LDR primers results in the formation of LDR
products that bear a zip code complement at the 5' end and a
fluorescent label at the 3' end. The universal DNA microarray
permits the detection of the ligation products via hybridization
of the zip code complements to zip codes spotted on the array.
The results obtained with these samples showed that the
universal array could detect a fluorescent signal from each of the
nine different LDR products which correctly hybridized to
their designated addresses on the array (Fig. 4). This indicates
that the assay can be performed by using either CE or a
universal array as the final readout.

DISCUSSION

In this report, we describe the development of a new WNV
detection method based on multiplex RT-PCR followed by LDR.
Our detection strategy was based on finding regions in the WNV
genome that were most invariant among the different strains
belonging to both lineages. We then designed PCR primers that
had the required specificity to amplify WNV-specific RNA (after
RT) while tolerating sequence variation without amplifying the
far more abundant human RNA. Likewise, the LDR primers
were designed to specifically ligate, even if the target sequence
varied in up to three positions. The high sensitivity of the initial
RT-PCR step, with degenerate primers, allows some tolerance to
mismatches, which is complemented by the high specificity of the
LDR step. LDR uses an exquisitely specific thermostable AK16D
DNA ligase that permits ligation only when the sequence at the
junction between the paired oligonucleotides is complementary to
the template sequence. This type of assay is ideal for multiplexing
since several primer sets can ligate along a template without the
interference encountered in polymerase-based assays (18, 19).

FIG. 4. Representative WNV detection with a universal DNA
microarray. (A) Schematic of the microarray. The green box includes the
area where the nine WNV-specific zip codes were spotted in duplicate
(E4 to E12 and F4 to F12), with red, green, and blue spots indicating
the three sets of zip codes corresponding to LDR products for
amplicon regions 1, 2, and 3, respectively. Other zip codes on the array are
designated for detecting dengue virus (yellow spots) and other
hemorrhagic fever viruses (white spots). (B) LDR products from a
representative WNV-positive mosquito pool sample revealing hybridization
of LDR products from each region to the correct zip codes. The colors
represent the fluorescence intensity at each spot, white being the
strongest and blue being the weakest. (C) LDR products from a
WNV-negative control mosquito pool sample. (D) Hybridization of a
carboxy-X-rhodamine-labeled fiducial complement internal control to
verify uniform spotting of zip codes.

The multiplex RT-PCR/LDR/CE test was evaluated with
both mosquito pools and clinical samples, with the clinical
samples being subjected to both one-step and two-step
RT-PCR protocols. The sensitivity obtained with the mosquito
pool samples was 98%; the only sample which gave a
negative result was retested at the NYC DOHMH, where it was
confirmed as negative, suggesting possible sample
degradation.

The WNV-positive clinical samples tested with the one-step
protocol, which uses target-specific primers for RT and PCRs,
were detected with a sensitivity of 100%. On the other hand,
with the two-step method, where random hexamers are used
for the RT step, 82% of the samples gave a positive result. The
higher sensitivity of the one-step versus the two-step approach
may result from the better performance of the former over the
latter method, which was demonstrated during LOD testing
(Fig. 3). Although the sensitivity obtained by the two-step
approach with mosquito pools did not seem to be affected by
this drawback, it has been reported that for low mRNA levels
(like those expected to be found in clinical samples versus
mosquito pools), gene-specific priming provides a more
sensitive method (35).

When the test was evaluated on 34 different WNV strains,
the LDR failed to detect the Rabensburg strain and the two
Indian isolates belonging to clade 1c or, as new evidence
shows, forming a distinct fifth lineage (9). Genomic sequences
for these isolates were not available when the LDR primers
were designed. Alignment of the Rabensburg and IND 804994
strain sequences, which have since been published, reveals that
they are too divergent to successfully anneal with the initial
primers. Since LDRs can be highly multiplexed without
compromising the ligation efficiency (14, 29), primers permitting
the detection of these isolates may easily be incorporated into
future versions of the assay. The flexibility of the technique will
permit the expansion of the assay to include emerging new
WNV strains in a similar manner.

Over the past few years, the PCR/LDR approach has been
used for several applications in our laboratories (12, 14-16,
19). The use of LDR primers with specific sequences
appended, termed "zip code" complements, has enabled the
detection of LDR products through a universal DNA microarray
containing designated addressable zip codes (16, 18). A
universal DNA microarray offers the advantage of being
completely programmable and permits the inclusion of new
genomic target sequences without redesigning the array. In
addition, different pathogens can be detected simultaneously
since the hybridization event is mediated by the spotted zip
code and zip code complements on the LDR primers in place
of the actual pathogen's genomic sequence. By uncoupling
pathogen detection from pathogen identification, the same
type of array can be used simultaneously for different
organisms without changing the spotted probes.
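
The uncoupling described above can be pictured as two separate lookup tables: a fixed array layout (address to zip code) and an assay-specific assignment (zip code to target). The sketch below is purely illustrative; the zip code labels, the ordering of products across addresses E4 to E12, and all names are hypothetical and are not taken from the published array design.

```python
# Fixed array layout: address -> spotted zip code. This never changes between assays.
# Labels "zip1".."zip9" and their order across E4..E12 are hypothetical.
ARRAY_LAYOUT = {f"E{col}": f"zip{col - 3}" for col in range(4, 13)}

# Assay-specific assignment: zip code -> target reported by the LDR primer
# that carries the matching zip code complement (hypothetical mapping).
WNV_ASSAY = {
    f"zip{i}": f"WNV region {(i - 1) // 3 + 1}, LDR product {(i - 1) % 3 + 1}"
    for i in range(1, 10)
}


def identify(hybridized_addresses, layout, assay):
    """Translate fluorescent addresses into targets for a given assay."""
    return [assay[layout[addr]]
            for addr in hybridized_addresses
            if layout.get(addr) in assay]


print(identify(["E4", "E12"], ARRAY_LAYOUT, WNV_ASSAY))
# ['WNV region 1, LDR product 1', 'WNV region 3, LDR product 3']
```

Supporting a new target in this picture means supplying a different assignment table (that is, new LDR primers with existing zip code complements) while `ARRAY_LAYOUT` stays untouched.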

Our group recently demonstrated the utility of PCR/
LDR/CE in the multiplexed detection of blood-borne bacterial
infectious agents (29). Due to the frequently nonspecific
clinical symptoms of viral infections and the overlap of different
arboviruses in the same geographic area, we envision a similar
approach to the detection of blood-borne viral pathogens, both
in clinical specimens and for environmental surveillance. The
use of a multiplex RT-PCR/LDR for detection and a universal
DNA microarray for identification represents a convenient
tool given the frequent sequence variation in RNA viruses
which may necessitate additions to the detection primers used.
This approach can assist epidemiologists in rapidly tracking
unknown and emerging strains of HFV. We have designed the
same type of test for the detection and serotype determination
of dengue virus in clinical samples (unpublished data), as well
as other hemorrhagic fever viruses, paving the way for a
comprehensive viral detection method.